Adaptive filter

ABSTRACT

An adaptive filter is implemented by a computer ( 10 ) processing an input signal using a recursive least squares lattice (RLSL) algorithm ( 12 ) to obtain forward and backward least squares prediction residuals. A prediction residual is the difference between a data element in a sequence of elements and a prediction of that element from other sequence elements. Forward and backward residuals are converted at ( 14 ) to interpolation residuals which are unnormalized Kalman gain vector coefficients. Interpolation residuals are normalized to produce the Kalman gain vector at ( 16 ). The Kalman gain vector is combined at ( 18 ) with input and reference signals x(t) and y(t), which provides updates for the filter coefficients or weights to reflect these signals as required to provide adaptive filtering.

This invention relates to an adaptive filter and to a method and acomputer program for implementing it.

Filtering is one of the most common procedures in signal processing. Afilter processes a signal or signals at its input to produce an outputsignal having certain desirable characteristics. The exact form whichthese characteristics take is determined by the values taken by certainfilter parameters. In a linear filter, the output is a weighted linearcombination of the input signals and delayed versions thereof asgenerated by a tapped delay-line. The filter has controllable parameterswhich are weights used to weight this linear combination.

One known form of filter is a time-series filter (or transversal filter,which might be adaptive) with only one input signal. The filter forms alinear combination of delayed versions of the signal. The filter weightsmay be chosen to select or reject different frequency components of theinput signal. Another form of filter is a phased array radar beamformer:in receive mode such a beamformer receives signals from an array ofspatially separated sensors, the signals being generated in response toreception of radar waves reflected from a scene. The filter weights maybe chosen, for example, to select or reject components of the sensorarray signals depending on the radar wave direction of arrival at thesensors: such a beamformer is said to be a spatial filter because of itsspatially selective properties. A sonar array beamformer is a relatedexample of a spatial filter: here in receive mode signals from an arrayof sonar transducers and delayed versions of these signals constitutethe input to the filter. By varying the filter weights, the sonarbeamformer can select or reject signals based on both direction ofarrival and frequency. Radar and sonar beamformers are also operative ina transmit mode in which filter techniques may be used to configure thetransmitted beam.

A finite impulse response (FIR) filter, or transversal filter, is aknown form of filter which may be implemented in hardware as a tappeddelay line comprising a series-connected set of clock-activatedregisters each associated with a respective multiplier andmultiplicative coefficient. Increasingly though FIR filters areimplemented equivalently in software running on a computer system andaccepting data from a sensor after conversion from analogue to digitalelectronic form. In a delay line version, a sequence of data elementsx(i), where i denotes a clock cycle index, is clocked along the line,each register holding one element at a time. The total number m of dataelements is very much larger than the number N of registers, so only Nsuccessive elements occupy the line at any time. The line is occupied bydata elements x(1) to x(N) on the Nth clock cycle. Clocking the lineadvances the sequence by one register, so that the leading elementleaves one end of the line and a new element is introduced at the other.On the pth clock cycle the line is occupied by data elements x(p−N+1) tox(p). On each clock cycle the data element in each register ismultiplied by the respective filter coefficient in the associatedmultiplier in each case (w_(i) for the ith register, i=1 to N), and theresulting products are added to form a result which is Z_(N)(p) for thepth cycle.I.e.:

$\begin{matrix}{{z_{N}(p)} = {\sum\limits_{i = 1}^{N}{w_{i}{x\left( {p - i + 1} \right)}}}} & (1)\end{matrix}$

Equation (1) may alternatively be expressed as the scalar (dot) productof a weight vector w having elements w₁ to w_(N) (the transversal filtercoefficients or weights) and a data vector x_(N)(p) having elements x(p)to x(p−N+1):z _(N)(p)=w ^(T) x _(N)(p)  (2)where the superscript index T indicates a vector transpose.

It is a common requirement to obtain meaningful information from a setof data containing unwanted noise or other corrupting components inaddition to a useful signal, and it is known to use linear filtering forthis to improve signal to noise ratio. However, the choice of filter isdifficult if not impossible if noise and signal characteristics areunknown: it then becomes desirable to use an adaptive filter withcharacteristics which are modifiable in response to data input to thesystem: in this regard filter adaptivity is important so that the filterproperties will be determined from input data to be filtered, instead ofa priori from making an estimate or assumption regarding the datacharacteristics. Filter adaptivity is realised by an adaptive filterwith coefficients and therefore also properties which can be changed.

Adaptive filters have applications in fields such as radar and sonar asmentioned above, and also in communications and processing of images andspeech. They can be used inter alia to perform channel equalisation,signal detection (i.e. matched filtering), acoustic echo cancellationand adaptive noise cancellation.

A typical example of an adaptive filter is that of an echo canceller ina ‘hands-free’ telephone system. Here a telephone microphone picks upnot only speech from a person making the call, but also that of arecipient of the call whose speech is output by the telephone'sloudspeaker. If no corrective action is taken the recipient's speech isalso sent to the recipient and appears as an echo. If the recipient isusing a hands-free telephone then the reply may be returned to thecaller and positive feedback or ‘howl around’ may result.

The echo may be removed by an adaptive filter. The telephone'sloudspeaker is driven by a speech signal x(t) which becomes an inputsignal to the filter, and a reference signal y(t) is picked up by thetelephone's microphone. The filter coefficients are adjusted adaptivelyto cancel the echo signal in y(t). Once the echo is removed, theresidual signal, which contains only the caller's speech, is transmittedto the telephone call recipient. This residual signal may be termed anerror signal e_(N)(t), and it is used in adjusting the filtercoefficients as will be described later.

There has been considerable work on the development of digital signalprocessing algorithms for adaptive filtering. There are two broadclasses of algorithm: block-based algorithms and recursive algorithms.The former are used to process data in finite length blocks and thelatter are used for processing a continuous data stream. The choice ofalgorithm depends on the intended field of use but for many problemsrecursive algorithms are more attractive.

For many years only two types of recursive adaptive filtering algorithmswere known: Stochastic Gradient Descent (SGD) algorithms and RecursiveLeast Squares (RLS) algorithms. More recently recursive adaptivefiltering algorithms of a new type have been introduced, and can beviewed as being hybrids of SGD and RLS algorithms.

An SGD algorithm (as typified by the well known Least Mean Squares (LMS)algorithm) requires very few mathematical operations per time instantand is robust to finite precision numerical effects, i.e. largefractional errors in small digital quantities; unfortunately the LMSalgorithm is, in most cases, very slow to converge to an optimumsolution.

RLS algorithms on the other hand are very quick to converge. Howeverthey require more mathematical operations per unit time, and this canmake them completely impractical. For example, a filter for a hands-freetelephone would require 4,000,000 floating-point arithmetic operations(flops) in one clock cycle of 100 μsec, equivalent to 4×10¹⁰ flops/sec,which is not currently achievable using a single computer processor.

Furthermore RLS algorithms may exhibit mathematical instability; i.e.they may fail to operate properly due finite precision numericaleffects, such as division by small inaccurate quantities: e.g. aneight-bit number may be inaccurate as regards the value of the leastsignificant bit; if all bits of this number of other than the leastsignificant bit are zero, i.e. the number is 00000001, then inaccuracycan lead to it becoming 00000000: the calculation may then involvedivision by zero.

A further consideration is that some RLS algorithms do not calculate therequired filter coefficients explicitly but instead related quantitieswhich are less useful for some applications.

A new class of hybrid algorithm was introduced in order to address theconflicting problems of convergence speed and computational burden.Examples are the Affine Projection algorithm and the Fast Newtonalgorithm. Here an LMS-type algorithm and an RLS algorithm work togetherto solve the adaptive filtering problem. The RLS algorithm is used tosolve a problem that is much smaller than the original adaptive filterproblem and hence does not present too much additional computationalload. Together the LMS and RLS algorithms allow the adaptive filter toconverge more quickly than the LMS on its own but usually not as fast asa proper RLS-based adaptive filter. These algorithms suffer from thefact that the RLS component to the algorithm can become numericallyunstable.

At present there are three classes of RLS algorithms: quadratic RLSalgorithms, fast transversal filter algorithms and RLS latticealgorithms. Quadratic RLS algorithms can be numerically stable but theyhave the disadvantage that, for a problem with N unknown coefficients tobe determined, they require a number of mathematical operations perclock cycle that is proportional to N². For applications with large N(e.g. the acoustic echo cancellation problem where typically N=2000)this results in expensive hardware or is even impractical given currenttechnology.

Fast transversal filter algorithms and RLS lattice algorithms on theother hand only require a number of mathematical operation per clockcycle that is proportional to N (termed ‘fast’ algorithms).Unfortunately, fast transversal filter algorithms are virtually unusablefor large problems, due their numerical instability, i.e. sensitivity tofinite precision numerical effects arising from inaccuracies androunding errors in calculations. Moreover, although it is possible forRLS lattice algorithms to be numerically stable, they do not produce themost desirable parameters (the filter coefficients) directly.

In summary, RLS algorithms have optimal performance to the extent thatthey calculate the optimal set of filter coefficients (or relatedquantities) at all times. For a single channel problem of order N, i.e.having a number N of quantities to be determined, RLS algorithms fallinto three categories:

-   -   O(N²) algorithms: computationally expensive—prohibitively so for        large values of N;    -   O(N) LS Lattice algorithms: computationally inexpensive (fast)        but transversal filter weights are difficult to extract;    -   O(N) Fast Transversal Filter algorithms: computationally        inexpensive (fast) but numerically unstable;        here “O” indicates the order of magnitude (N or N²) of the        number of calculations to be performed per clock cycle as being        that in the associated parentheses in each case.

Regardless of the field of application, most adaptive filteringapplications demand the use of an algorithm with as high a performancebut as low a complexity as possible. However, it is a prerequisite thatthe algorithm be numerically stable at least for all practical intents.Furthermore, many adaptive filtering applications require the actualfilter weights themselves, not some other related parameters. In theprior art there is no RLS algorithm for an adaptive filter with theseproperties when the input to the filter consists of, in part or whole,delayed versions of one or more signals.

It is an object of this invention to provide an alternative form ofadaptive filter.

The present invention provides an adaptive filter characterised in thatit is arranged to update filter weights by means of a gain vectorderived from interpolation residuals of a sequence of signal samplesapplied to the filter.

In this connection, a ‘Kalman’ gain vector is known in the prior art forthe purposes of updating adaptive filter weights or coefficients,although not its derivation from interpolation residuals. Aninterpolation residual is one kind of prediction residual: a predictionresidual is the difference between a directly obtained data element anda prediction of that element extrapolated from other data associatedwith it. An interpolation residual is obtained using data both precedingand following the data element. There are also forward and backwardprediction residuals obtained using forward and backward extrapolationrespectively from a directly obtained element.

The invention provides a number of advantages: it is an alternative formof adaptive filter; unlike prior art filters based on RLS latticealgorithms, the required filter weights are derived directly, not someother related parameters, and the invention can be implemented with anRLS algorithm or an algorithm with reduced computational load based on asimplifying assumption regarding the characteristics of the signalsamples. There are also good theoretical reasons indicating that theinvention is expected to provide results that are more numericallyrobust than alternative approaches.

In a preferred embodiment, the present invention also provides anadaptive filter characterised in that it is arranged to:

-   -   a) process an input sequence of signal samples to derive        prediction residuals;    -   b) convert prediction residuals to interpolation residuals,    -   c) derive elements of a gain vector from the interpolation        residuals; and    -   d) combine the gain vector with input and reference signals and        update the filter coefficients or weights as required to provide        adaptive filtering.

The gain vector may be a Kalman gain vector. The prediction residualsmay be least squares prediction residuals, and these may be obtained byprocessing a sequence of signal samples using a recursive least squareslattice algorithm.

Prediction residuals may be converted to interpolation residualscorresponding to gain vector elements by an iterative approach in whicheach iteration changes an index (to be defined later) of a residual orof an intermediate quantity derived therefrom. The iterative approachmay be a divide and conquer approach implemented by treating predictionresiduals as interpolation residuals, iteration being arranged toproceed until the index is changed appropriately to convert theprediction residual to an interpolation residual providing an element ofthe gain vector in un-normalised form.

The iterative approach may be implemented by treating predictionresiduals as interpolation residuals with zero values for one of twoindices: this corresponds to an absence of succeeding or preceding timeseries signal samples. The iterations are arranged to proceed until thezero index in each case is changed sufficiently to convert the forwardor backward residual to an interpolation residual which is also anelement of the gain vector in un-normalised form. It may also involvetreating as an intermediate result an iteration in a sequence thereofleading to an element of the un-normalised gain vector, iteration beingarranged to proceed until an index of the intermediate result is changedsufficiently to convert such result to an interpolation residualcorresponding to an element of the gain vector.

For an adaptive filter of order N where N is equal to 2^(x) and x is apositive integer, iteration in a sequence to generate an un-normalisedelement of the gain vector may begin with use of one of two residualsimmediately adjacent and on opposite sides of a halfway point, theresidual of use being that having a relatively higher value of the indexnot to be altered in the iteration sequence, and the relevant halfwaypoint being halfway between two quantities:

-   -   a) one of which is an interpolation residual and the other a        member of the relevant time series for which the gain vector is        generated, or    -   b) one of which is an interpolation residual and the other a        starting point for an earlier iteration, or    -   c) which are respectively starting and end points for an earlier        iteration.

For an adaptive filter of order N where N is not a power of two, theiterative approach may still be used but will involve treating thefilter weight vector as a combination of weight vectors of non-equalorders e.g. if N is equal to a sum of integers each of which is a powerof two, the filter can be treated as a combination of filters each oforder a respective power of two.

Iteration in a sequence to generate an un-normalised element of a gainvector may begin with use of one of two residuals immediately adjacentand on opposite sides of a halfway point, the residual of use being thathaving a relatively higher value of the index not to be altered in theiteration sequence, and the relevant halfway point being halfway betweentwo quantities:

-   -   a) one of which is an interpolation residual and the other a        member of the relevant time series for which the gain vector is        generated, or    -   b) one of which is an interpolation residual and the other a        starting point for an earlier iteration, or    -   c) which are respectively starting and end points for an earlier        iteration.

The filter of the invention may arranged to convert prediction residualsto interpolation residuals corresponding to elements of a gain vector bya QR decomposition approach in which equations of the form:

-   -   a) ζ(i)=ψ(i)+k(i)ξ(i) are solved for ζ(i) and k(i) given ψ(i)        and ξ(i) subject to a constraint that in so doing there is        minimisation of a sum of squares of ψ(j)+k(i)ξ(j) obtained for        different sample indexes j; and    -   b) ψ(i)=ζ(i)−k(i)ξ(i) are solved for ψ(i) and k(i) given ζ(i)        and ξ(i) subject to a constraint that in so doing the value of        k(i) that is obtained is substantially that which would be        obtained in solving ζ(i)=ψ(i)+k(i)ξ(i) for ζ(i) and k(i).

QR decomposition may be arranged to employ square root free equivalentsof sine and cosine rotation parameters.

In an alternative aspect, the present invention provides a method foradaptive filtering characterised in that it includes updating filterweights with a gain vector derived from interpolation residuals of asequence of signal samples provided as data for filtering.

In a preferred embodiment, the method of the invention includes:

-   -   a) processing an input sequence of signal samples to derive        prediction residuals;    -   b) converting prediction residuals to interpolation residuals;    -   c) deriving elements of a gain vector from the interpolation        residuals; and    -   d) combining the gain vector with input and reference signals        and update the filter coefficients or weights as required to        provide adaptive filtering.

The prediction residuals may be least squares prediction residualsobtained by processing a sequence of signal samples using a recursiveleast squares lattice algorithm.

Prediction residuals may be converted to interpolation residualscorresponding to gain vector elements by an iterative approach in whicheach iteration changes an index (to be defined later) of a residual orof an intermediate quantity derived therefrom. The iterative approachmay be a divide and conquer approach which treats prediction residualsas interpolation residuals and changes the index appropriately toconvert the prediction residual to an interpolation residual providingan element of the gain vector in un-normalised form. It may treatprediction residuals as interpolation residuals with zero values for oneof two indices: this corresponds to absence of succeeding or precedingtime series signal samples, and changes the zero index in each casesufficiently to convert the forward or backward residual to aninterpolation residual which is also an element of the gain vector inun-normalised form

The iterative approach may also treat as an intermediate result aniteration in a sequence thereof leading to an element of the gain vectorand changes an index of the intermediate result sufficiently to convertsuch result to an interpolation residual corresponding to an element ofthe gain vector.

The filtering method may implement filtering of order N where N is equalto 2^(x) and x is a positive integer, characterised in that it includesiteration in a sequence to generate an un-normalised element of the gainvector, the iteration beginning with use of one of two residualsimmediately adjacent and on opposite sides of a halfway point, theresidual of use being that having a relatively higher value of an indexnot to be altered in the iteration sequence, and the relevant halfwaypoint being halfway between two quantities:

-   -   a) one of which is an interpolation residual and the other a        member of the relevant time series for which the gain vector is        generated, or    -   b) one of which is an interpolation residual and the other a        starting point for an earlier iteration, or    -   c) which are respectively starting and end points for an earlier        iteration.

The filtering method may implement filtering of order N where N is not apower of two, characterised in that the iterative approach involvestreating the filter as a combination of weight vectors each of order arespective power of two. It may include iteration in a sequence togenerate an un-normalised element of the gain vector, the iterationbeginning with use of one of two residuals immediately adjacent and onopposite sides of a halfway point, the residual of use being that havinga relatively higher value of an index not to be altered in the iterationsequence, and the relevant halfway point being halfway between twoquantities:

-   -   a) one of which is an interpolation residual and the other a        member of the relevant time series for which the gain vector is        generated, or    -   b) one of which is an interpolation residual and the other a        starting point for an earlier iteration, or    -   c) which are respectively starting and end points for an earlier        iteration.

Prediction residuals may be converted to interpolation residualscorresponding to elements of a gain vector by a QR decompositionapproach in which equations of the form:

-   -   a) ζ(i)=ψ(i)+k(i)ξ(i) are solved for ζ(i) and k(i) given ψ(i)        and ξ(i) subject to a constraint that in so doing there is        minimisation of a sum of squares of ψ(j)+k(i)ξ(j) obtained for        different sample indexes j; and    -   b) ψ(i)=ζ(i)−k(i)ξ(i) are solved for ψ(i) and k(i) given ζ(i)        and ξ(i) subject to a constraint that in so doing the value of        k(i) that is obtained is substantially that which would be        obtained in solving ζ(i)=ψ(i)+k(i)ξ(i) for ζ(i) and k(i).

QR decomposition may employ square root free equivalents of sine andcosine rotation parameters.

In a further aspect, the present invention provides a computer programfor implementing an adaptive filter characterised in that it is arrangedto generate updated filter weights by means of a gain vector derivedfrom interpolation residuals of a sequence of signal samples provided asdata for filtering.

In a preferred embodiment, the present invention provides a computerprogram for implementing an adaptive filter characterised in that it isarranged to:

-   -   a) process an input sequence of signal samples to derive        prediction residuals;    -   b) convert prediction residuals to interpolation residuals;    -   c) derive elements of a gain vector from the interpolation        residuals; and    -   d) combine the gain vector with input and reference signals and        update the filter coefficients or weights as required to provide        adaptive filtering.

The computer program may be arranged to convert prediction residuals tointerpolation residuals corresponding to gain vector elements by aniterative approach in which each iteration changes an index, to bedefined later, of a residual or of an intermediate quantity derivedtherefrom. The iterative approach may be a divide and conquer approachimplemented by treating prediction residuals as interpolation residuals,and wherein iteration is arranged to proceed until the index is changedappropriately to convert the prediction residual to an interpolationresidual providing an element of the gain vector in un-normalised form.It may be implemented by treating the prediction residuals asinterpolation residuals with zero values for one of two indices. Thiscorresponds to absence of succeeding or preceding time series signalsamples, and wherein iteration is arranged to proceed until the zeroindex in each case is changed sufficiently to convert the forward orbackward residual to an interpolation residual which is also an elementof the gain vector in un-normalised form.

The iterative approach may also involve treating as an intermediateresult an iteration in a sequence thereof leading to an element of thegain vector, wherein iteration is also arranged to proceed until anindex of the intermediate result is changed sufficiently to convert suchresult to an interpolation residual corresponding to an element of thegain vector.

The computer program may implement a adaptive filter of order N where Nis equal to 2^(x) and x is a positive integer, characterised in thatiteration in a sequence to generate an un-normalised element of the gainvector begins with use of one of two residuals immediately adjacent andon opposite sides of a halfway point, the residual of use being thathaving a relatively higher value of an index not to be altered in theiteration sequence, and the relevant halfway point being halfway betweentwo quantities:

-   -   a) one of which is an interpolation residual and the other a        member of the relevant time series for which the gain vector is        generated, or    -   b) one of which is an interpolation residual and the other a        starting point for an earlier iteration, or    -   c) which are respectively starting and end points for an earlier        iteration.

The computer program may implement a adaptive filter of order N where Nis not a power of two, characterised in that the iterative approachinvolves treating the weight vector as a combination of weight vectorseach of order a respective power of two.

Iteration in a sequence to generate an un-normalised element of the gainvector may begin with use of one of two residuals immediately adjacentand on opposite sides of a halfway point, the residual of use being thathaving a relatively higher value of an index not to be altered in theiteration sequence, and the relevant halfway point being halfway betweentwo quantities:

-   -   a) one of which is an interpolation residual and the other a        member of the relevant time series for which the gain vector is        generated, or    -   b) one of which is an interpolation residual and the other a        starting point for an earlier iteration, or    -   c) which are respectively starting and end points for an earlier        iteration.

The computer program may be arranged to convert prediction residuals tointerpolation residuals corresponding to elements of a gain vector by aQR decomposition approach in which equations of the form:

-   -   a) ζ(i)=ψ(i)+k(i)ξ(i) are solved for ζ(i) and k(i) given ψ(i)        and ξ(i) subject to a constraint that in so doing there is        minimisation of a sum of squares of ψ(j)+k(i)ξ(j) obtained for        different sample indexes j; and    -   b) ψ(i)=ζ(i)−k(i)ξ(i) are solved for ψ(i) and k(i) given ζ(i)        and ξ(i) subject to a constraint that in so doing the value of        k(i) that is obtained is substantially that which would be        obtained in solving ζ(i)=ψ(i)+k(i)ξ(i) for ζ(i) ad ζ(i).

QR decomposition may be arranged to employ square root free equivalentsof sine and cosine rotation parameters.

In order that the invention might be more fully understood, anembodiment thereof will now be described, by way of example only, withreference to the accompanying drawings, in which:

FIG. 1 is a block diagram of an adaptive filter of the invention;

FIG. 2 shows four successive stages of a recursive least squares latticealgorithm indicated in FIG. 1;

FIGS. 3 and 4 illustrate conversion of a first interpolation residualinto a second such residual with a relative change of unity in an index;

FIGS. 5 to 7 illustrate a “divide and conquer” approach to reducingcomputational load in a filter of the invention; and

FIG. 8 illustrates weight updating in a hardware version of theinvention.

In those prior art RLS algorithms which are not Lattice algorithms, whatis referred to as the ‘Kalman gain’ vector k^(N)(t) is used to update Nadaptive filter coefficients on receipt of each of successive inputsignals forming a time series, i.e. a series of data values representingthe variation of an input signal x(t) as a function of time t. Here t isin units of an interval between successive digital signal samples (clockcycles), and has integer values 1, 2 etc.

Designating the vector of adaptive filter coefficients at times t andt−1 as w_(N)(t) and w_(N)(t−1), where N is the number of filtercoefficients (the order of the filter), then an operation to updatefilter coefficients adaptively in response to input of a new signalsample is defined by:w _(N)(t)=w _(N)(t−1)+e _(N)(t)k ^(N)(t)  (3)where:e _(N)(t)=y(t)−w _(N) ^(T)(t−1)x _(N)(t)  (4)and is known as the a priori error, x_(N)(t) is a vector with elementsconsisting of input signal values from x(t) to x(t−N+1) as discussedabove, superscript index T indicates a matrix transpose and y(t) is themost recent value of a reference signal detected for use in adaptivefiltration as mentioned earlier in the example of a hands-free telephonesystem: x_(N)(t) and y(t) are therefore known quantities. The Kalmangain vector k^(N)(t) is conventionally defined as:

$\begin{matrix}{{{\underset{\_}{k}}^{N}(t)} = {{M_{XX}^{- 1}(t)}{{\underset{\_}{x}}_{N}(t)}}} & (5)\end{matrix}$where M_(xx) (t) is a covariance matrix of input signals given by:

$\begin{matrix}{{{M_{XX}(t)} = {\sum\limits_{i = 0}^{L_{t} - 1}{\beta_{1}^{2i}{{\underset{\_}{x}}_{N}\left( {t - i} \right)}{{\underset{\_}{x}}_{N}^{T}\left( {t - i} \right)}}}},} & (6)\end{matrix}$x_(N)(t−i) is a N-dimensional (column) vector with elements consistingof input signal values from x(t−i) to x(t−i−N+1) as discussed above, andx_(N) ^(T)(t−i) is its transpose (row vector). The parameter L₁determines the number of samples that are included in the summation.Sometimes L₁ is chosen to be a fixed number and sometimes it is arrangedto vary with time: e.g. L₁=t+1 is common. The Equation (6) summationcould also start from a non-zero value of i if it were decided todiscard the most recent data values. The term β₁ is what is referred toas a “forget factor”: it has a value in the range 0<β₁≦1, normally0.99<β₁<1.0, and it decrements data elements so that the most recentdata element x(t) is undecremented, that immediately preceding it isdecremented once, the next twice and so on with the nth preceding itdecremented n times. The effect of this is to bias the Equation (6)summation in favour of more recent data: i.e. it reduces the value of adata element each time the latter is used so that older data hasprogressively less influence on the solution compared to more recentdata. Use of forget factors in signal processing is known, see e.g. U.S.Pat No. 4,727,503 to McWhirter.

Most RLS algorithms that use the Kalman gain vector calculate it usingthe above formulation or something related.

However, it has been discovered in accordance with the invention thatthe ith element k_(i) ^(N)(t) of the Nth order Kalman gain vectork^(N)(t) for a signal at time t is also given by:

$\begin{matrix}{{k_{i}^{N}(t)} = \frac{ɛ_{{N - i},{i - 1}}\left( {t - i + 1} \right)}{E_{{N - i},{i - 1}}\left( {t - i + 1} \right)}} & (7)\end{matrix}$

where ε_(N−i,i−1)(t−i+1) is what is referred to as an a posteriori leastsquares interpolation residual whose definition and derivation isdescribed later. E_(N−i,i−1)(t−i+1) is a normalisation factor (alsodescribed later) corresponding to the power of the interpolationresidual ε_(N−i,i−1)(t−i+1). Equation (7) indicates that the timeparameter (i.e. t−i+1) of the terms on its right hand side depends onthe index i of the ith element k_(i) ^(N)(t), and therefore the timeparameter varies from element to element of the Kalman gain vectork^(N)(t).

Changing indices for convenience, for a signal element x(t−ƒ) in asequence comprising a time series x(1) to x(t), a least squaresinterpolation residual is designated ε_(p,ƒ)(t−ƒ): the indices p andƒindicate that p signal elements before x(t−ƒ) and f signal elementsafter it in the time series are used in the interpolation: thisinterpolation residual is derived by subtracting from the element x(t−ƒ)itself an estimated interpolation or prediction of it derived from thesignal elements both before (a total of p) and after (a total of f) itin the time series. An interpolation for the signal element x(t−ƒ) isgiven by a weighted linear combination or summation of p elements beforeit: x(t−ƒ−p) to x(t−ƒ−1) inclusive, together with a second suchcombination of the f elements after it: x(t−ƒ+1) to x(t). The leastsquares interpolation residual for the signal element x(t−ƒ) is then thedifference between the element and the interpolation,

$\begin{matrix}{{i.e.\mspace{14mu}{ɛ_{p,f}\left( {t - f} \right)}} = {{x\left( {t - f} \right)} - {\sum\limits_{i = 1}^{f}{{{\hat{w}}_{p,f,i}\left( {t - f} \right)}{x\left( {t - i + 1} \right)}}} - {\sum\limits_{i = {f + 1}}^{f + p}{{{\hat{w}}_{p,f,i}\left( {t - f} \right)}{x\left( {t - i} \right)}}}}} & (8)\end{matrix}$where the vector ŵ_(p,ƒ)(t−ƒ) has (p+f) dimensions and containsadjustable coefficients ŵ_(p,ƒ,i)(t−ƒ) (i=1 to f+p) and the summationterms are the weighted linear combinations. The reference to “leastsquares” indicates that these coefficients are those that would beproduced by any least squares optimisation procedure. In a preferredembodiment of this invention, these coefficients are never calculatedand the interpolation residuals are calculated via an alternative methodas will be described later. Nevertheless, the least squares minimisationprocedure involves determining coefficients in ŵ_(p,ƒ)(t−ƒ) in such away as to minimise a sum of a preselected number L₂ of the squares ofthe residuals determined up to and including the most recent residual,i.e. ŵ_(p,ƒ)(t−ƒ) is the value of the vector ω that minimises:

$\begin{matrix}{{J_{{p.},f}\left( {t - f} \right)} = {\sum\limits_{n = 0}^{L_{2} - 1}{\beta_{2}^{2n}\left( {{x\left( {t - n - f} \right)} - {\sum\limits_{i = 1}^{f}{\omega_{i}{x\left( {t - n - i + 1} \right)}}} - {\sum\limits_{i = {f + 1}}^{f + p}{\omega_{i}{x\left( {t - n - i} \right)}}}} \right)}^{2}}} & (9)\end{matrix}$

As L₁ above, L₂ may be chosen to be invariant or may vary with time: β₂is a “forget factor” like β₁ previously discussed.

Referring to Equation (7) once more, as has been said E_(N−i,i−1)(t−i+1)is a normalisation factor which corresponds to the power of theinterpolation residual signal ε_(N−i,i−1)(t−i+1): it can be calculatedfrom the interpolation residuals themselves. There are many ways inwhich this can be done such as by forming a sum of a preselected numberL₃ of the product of the a posteriori residuals and what are referred toas a priori residuals, which are defined below, and are determined up toincluding the most recent time instant, i.e.:

$\begin{matrix}{{E_{{N - i},{i - 1}}\left( {t - i + 1} \right)} = {\sum\limits_{n = 0}^{L_{3} - 1}{\beta_{3}^{2n}{e_{{N - i},{i - 1}}\left( {t - n - i + 1} \right)}{ɛ_{{N - i},{i - 1}}\left( {t - n - i + 1} \right)}}}} & (10)\end{matrix}$

The term e_(N−i,i−1)(t−i+1) is what is referred to as an a priori leastsquares interpolation and is calculated in a manner similar to the aposteriori residual (Equation (8)) except that the weight vector used isthat calculated at the previous time instant i.e.

$\begin{matrix}{{e_{p,f}\left( {t - f} \right)} = {{x\left( {t - f} \right)} - {\sum\limits_{i = 1}^{f}{{{\hat{w}}_{p,f,i}\left( {t - f - 1} \right)}{x\left( {t - i + 1} \right)}}} - {\sum\limits_{i = {f + 1}}^{f + p}{{{\hat{w}}_{p,f,i}\left( {t - f - 1} \right)}{x\left( {t - i} \right)}}}}} & (11)\end{matrix}$

An a priori residual is related to an a posteriori equivalent by aso-called conversion factor δ_(p,ƒ)(t−ƒ) (calculation of which will bedescribed later) as follows:ε_(p,ƒ)(t−ƒ)=δ_(p,ƒ)(t−ƒ)e _(p,ƒ)(t−ƒ)  (12)

Equations (7) to (11) demonstrate the discovery in accordance with theinvention that the gain vector (in the present example the Kalman gainvector) required to update filter coefficients can be generated frominterpolation residuals. In this regard Equation (7) shows that eachelement of the Kalman gain vector is a ratio of a respectiveinterpolation residual to the normalisation coefficient given byEquation (10), and Equations (8) and (9) yield the residuals themselvesfor derivation of the normalisation coefficient and Kalman gain vector.

In the present example all forget factors β₁ etc. are made equal: inthis regard the value chosen depends on the nature of the signals beingfiltered. The associated number of summation terms is such thatL₁=L₂=L₄=L₅=t+1, i.e. all available terms are included: NB L₄ and L₅ aredefined in later equations.

In an adaptive filter in accordance with the invention therefore, a gainvector is determined from interpolation residuals such as for examplethat given by:ε_(p,ƒ)(t−ƒ)=x(t−ƒ)−ŵ _(p,ƒ) ^(T)(t−ƒ){circumflex over (x)}_(p+ƒ)(t,t−ƒ)  (13)where x(t−f) is the (t−f)th data element of the time series, ŵ_(p,ƒ)^(T)(t−f){circumflex over (x)}_(p+ƒ)(t,t−ƒ) represents an estimate ofx(t−ƒ) obtained by interpolation from both earlier and later elements(subject to availability) of the time series, ŵ_(p,ƒ) ^(T)(t−ƒ) is aninterpolation coefficient vector and as would be obtained in a leastsquares minimisation procedure. As mentioned earlier, the interpolationresiduals may obtained without calculating ŵ_(p,ƒ) ^(T)(t−ƒ) asdescribed later. The quantity {circumflex over (x)}_(p+ƒ)(t,t−ƒ) is avector consisting of p+f elements of the time series from x(t−ƒ−p) tox(t) inclusive except for the omission of x(t−ƒ).

Referring to FIG. 1, stages for calculating an updated weight vector foran adaptive filter in response to a new input signal are shown inoutline. These will be described briefly initially and more detail willbe given later. A computer indicated by a box 10 is programmed to carryout four generalised processing stages 12 to 18. Of the latter, stage 12comprises processing an input signal using an RLS lattice algorithm andobtaining forward and backward least squares prediction residuals. Asindicated earlier, a prediction residual is the difference between adirectly obtained data element and a prediction of that elementextrapolated from other data associated with it; a forward predictionresidual is obtained using forward extrapolation of data elementspreceding a directly obtained element, and the backward equivalent isobtained by using backward extrapolation of data following that value.Forward and backward prediction residuals are naturally generated by anRLS lattice algorithm in a fast, numerically stable manner.

In stage 14, the forward and backward prediction residuals are used togenerate interpolation residuals as appropriate to provide elements ofan un-normalised Kalman gain vector with elements as indicated in thenumerator of the right hand side of Equation (7). In the presentexample, a priori interpolation residuals are calculated together withconversion factors to convert them to a posteriori interpolationresiduals using Equation (12). This approach is taken as it makesinitialisation of various stored parameters easier than if a posterioriresiduals are used.

In stage 16, to produce the normalised Kalman gain vector, each aposteriori interpolation residual is normalised by division byE_(N−i,i−1)(t−i+1) the normalisation factor: this factor is calculatedas the weighted sum of products, each product being that between apriori and a posteriori interpolation residuals for a respective timeinstant and the summation is over such products for all time instantsfor which signal samples have been obtained up to and including a mostrecent time instant as in Equation (10).

In stage 18, as indicated in Equations (3) and (4), the Kalman gainvector derived from input signal elements up to and including x(t) iscombined with x(t) itself and a reference signal y(t); this produces anupdated version of the filter coefficients or weights which havetherefore been modified in response to both input and reference signalelements x(t) and y(t). To allow for processing delay in deriving theKalman gain vector in stages 12 to 16, to achieve simultaneity at inputto stage 18, the signal elements x(t) and y(t) are stored, or, ifnecessary in a real time system, delayed.

The RLS lattice algorithm is implemented in stage 12 as follows. A LeastSquares (LS) lattice algorithm is an efficient algorithm for calculatingleast squares forward and backwards prediction residuals of a sequenceof signals such as a time series: a time series is a series of datavalues representing the variation of a signal x(t) as a function oftime.

An Nth order adaptive filter corresponds to a weight vector w_(N) havingN coefficients—the filter coefficients. Linear prediction residuals areproduced as a first step in obtaining these coefficients and are oforder 1 to N−1 inclusive. Residuals may be a posteriori or a priori,i.e. they may be produced using either a most recently determined weightvector or such a vector determined immediately previously. The nth ordera posteriori LS forward prediction residual ε_(n) ^(F)(t) for a timeseries of signals with a value x(t) at time t is defined as:ε_(n) ^(F)(t)=x(t)−a _(n) ^(T)(t)x _(n)(t−1)  (14)where x_(n)(t−1) is a data vector having elements x(t−1) to x(t−n)inclusive, and a_(n)(t) is the forward prediction coefficient vectorwhich is again determined in a least squares optimisation proceduredescribed later: this vector is determined in such a way as to minimisea weighted sum J_(n) ^(F)(t) of a preselected number (L₄) of squares ofresiduals determined up to and including a most recent residual, i.e.a_(n)(t) is the value of the vector ω that minimises:

$\begin{matrix}{{J_{n}^{F}(t)} = {\sum\limits_{m = 0}^{L_{4} - 1}{\beta_{4}^{2m}\left( {{x\left( {t - m} \right)} - {{\underset{\_}{\omega}}^{T}{{\underset{\_}{x}}_{n}\left( {t - m - 1} \right)}}} \right)}^{2}}} & (15)\end{matrix}$where β₄ is a forget factor as described earlier.

Similarly, the nth order a posteriori LS backward prediction residualε_(n) ^(B)(t) for a time series of signals with a value x(t) at time tis defined as:ε_(n) ^(B)(t)=x(t−n)−c _(n) ^(T)(t)x _(n)(t)  (16)where c_(n)(t) is a coefficient vector for backward predictiondetermined as before by a least squares optimisation procedure whichminimises a weighted sum of a number L₅ of squares of residualsdetermined up to and including the most recent residual, i.e. c_(n)(t)is the value of the vector ω that minimises:

$\begin{matrix}{{J_{n}^{B}(t)} = {\sum\limits_{m = 0}^{L_{5} - 1}{\beta_{5}^{2m}\left( {{x\left( {t - m - n} \right)} - {{\underset{\_}{\omega}}^{T}{{\underset{\_}{x}}_{n}\left( {t - m} \right)}}} \right)}^{2}}} & (17)\end{matrix}$where β₅ is a forget factor.

The corresponding a priori forward and backward LS prediction residualse_(n) ^(F)(t) and e_(n) ^(B)(t) respectively are defined similarly asfollows:e _(n) ^(F)(t)=x(t)−a _(n) ^(T)(t−1)x _(n)(t−1)  (18)e _(n) ^(B)(t)=x(t−n)−c _(n) ^(T)(t−1)x _(n)(t)  (19)

Conversion factors δ_(n) ^(F)(t) and δ_(n) ^(B)(t) which relate forwardand backward a priori residuals to respective a posteriori equivalentsare defined as:

$\begin{matrix}{{\delta_{n}^{F}(t)} = \frac{ɛ_{n}^{F}(t)}{e_{n}^{F}(t)}} & (20) \\{{\delta_{n}^{B}(t)} = \frac{ɛ_{n}^{B}(t)}{e_{n}^{B}(t)}} & (21)\end{matrix}$

An Nth order Least Squares Lattice (LSL) algorithm generates a set offorward and backward a posteriori prediction residuals, N of each, i.e.{ε₁ ^(ƒ)(t), . . . ε_(N) ^(ƒ)(t), ε₁ ^(b)(t), . . . ε_(N) ^(b)(t)}. HereN is used in a general sense, and is not necessarily the same value as Nused elsewhere herein. The particular LSL algorithm used in the presentexample is a Square Root Free QR Decomposition-based Recursive LeastSquares Lattice (RLSL) Algorithm. Internally it calculates not the aposteriori prediction residuals directly but instead it calculates the apriori prediction residuals and the corresponding conversion factors.The algorithm is related to recursive square root free QR decompositionas described in U.S. Pat. No. 4,727,503 to McWhirter, which describesboundary and internal cell functions for a systolic array carrying outthis function using inter alia a square root free equivalent of Givensrotations. See also I K Proudler, J G McWhirter and T J Shepherd,“Computationally Efficient, QR Decomposition Approach to Least SquaresAdaptive Filtering”, IEE Proceedings, vol.138, Pt. F, no.4, August 1991,pp.341-353.

QR decomposition is a well-known procedure which is used in solving setsof linear equations by determining a matrix Q of rotation parameterswhich will transform a data matrix X into an upper right triangularmatrix R (i.e. with all sub-diagonal elements zero) by rotation of itsco-ordinate system: such an R matrix has a last row corresponding to oneequation with a single unknown and allowing the equations to be solvedby back substitution. Rotation parameters are sines and cosines orsquare root free equivalents of these.

FIG. 2 is a diagram illustrating the first four successive steps of theRLSL algorithm, which are concatenated stages: the ith stage (i=1, 2, 3,or 4) includes two internal cells (squares) Ia_(i) and Ib_(i), twoboundary cells Ba_(i) and Bb_(i) and two delay cells Da_(i) and Db_(i).The expressions “internal cell” and “boundary cell” are conventional inQR processing and originally distinguished above-diagonal cells fromon-diagonal cells in a triangular array: they are now used todistinguish cells which apply rotation parameters from those thatevaluate them.

As described in U.S. Pat. No. 4,727,503, boundary cells evaluaterotation parameters from input data and cumulatively multiplycosine-like rotation parameters; internal cells apply rotationparameters to input data. Each of these cells also recomputes and storesa quantity in response to each input, and their processing functionswill be described later in more detail. Each delay cell Da_(i) andDb_(i) store an input signal for a single time interval and outputs animmediately preceding input signal: e.g. when Da₁ receives a new inputsignal such as data element x(t) it outputs the preceding input signalx(t−1) and stores the new one until the next input signal x(t+1) isreceived. The ith stage provides both forward and backward predictionresiduals of order i using input signals x(t) when i=1 and when i is 2or more using residuals corresponding to outputs from stage i−1.

At time instant t, a data element x(t) is input to the first stage ofthe RLSL algorithm: it passes to the first internal cell Ia₁, firstdelay cell Da₁ and second boundary cell Bb₁. The first delay cell Da₁outputs x(t−1) to the second internal cell Ib₁ and first boundary cellBa₁. The first boundary cell Ba₁ and second boundary cell Bb₁ alsoreceive inputs of 1, the latter from the second delay cell Db₁(unnecessary but included for conformity between stages). The sequenceof operations then flows from left to right.

The cells of the first RLSL algorithm stage generate quantities and passthem as follows:

-   -   first boundary cell Ba₁: uses x(t−1) to update a quantity stored        within it, and it calculates square root free rotation        parameters from the stored quantity and passes them to the first        internal cell Ia₁; it calculates the conversion factor δ₁        ^(F)(t) (as defined in Equation (20) with n=1 relating a priori        and a posteriori forward prediction residuals) and passes it as        an input to the second stage first boundary cell Ba₂;    -   second boundary cell Bb₁: uses x(t) to update a stored quantity,        and it calculates square root free rotation parameters from the        stored quantity and passes them to second internal cell Ib₁; it        calculates the conversion factor δ₁ ^(B)(t) relating a priori        and a posteriori backward prediction residuals and passes it an        input to the second stage second delay cell Db₂, which responds        by outputting δ₁ ^(B)(t−1) (received previously) to the second        stage second boundary cell Bb₂;    -   first internal cell Ia₁: uses x(t) to update a stored quantity;        it then calculates the a priori forward prediction residual e₁        ^(F)(t) (i.e. n=1 in Equation (18)) and passes it to the second        stage first internal and second boundary cells Ia₂ and Bb₂;    -   second internal cell Ib₁: uses x(t−1) to update a stored        quantity; it then calculates the a priori backward prediction        residual e₁ ^(B)(t) (i.e. n=1 in Equation (19)) passes it to the        second stage first delay cell Da₂, which responds by outputting        e₁ ^(B)(t−1) (received previously) to the second stage second        internal cell Ib₂ and first boundary cell Ba₂;

The values e₁

(t), δ₁ ^(F)(t), e₁ ^(B)(t) and δ₁ ^(B)(t) are evaluated and passed onin this way by the first stage solving the first order (n=1) forward andbackward linear prediction problems: the second stage solves the secondorder (n=2) equivalent of this, i.e. the second stage internal andboundary cells Ia₂, Ba₂, Ib₂ and Bb₂ calculate results equivalent tothose of the first stage but with the subscript index 2, i.e. e₂^(F)(t), δ₂ ^(F)(t), e₂ ^(B)(t) and δ₂ ^(B)(t). Similarly, the third andfourth stages subsequently calculate third and fourth order equivalentswith index n=3 and n=4. Nth order prediction residuals require N stagesfour of which are shown in FIG. 2.

As has been said, the terms e_(n) ^(F)(t) and e_(n) ^(B)(t) generated asdescribed above are a priori prediction residuals: they can be convertedto a posteriori residuals ε_(n) ^(F)(t) and ε_(n) ^(B)(t) using termsδ_(n) ^(F)(t) and δ_(n) ^(B)(t) also generated by the RLSL algorithm andsubstituting into Equations (20) and (21) above as follows:ε_(n) ^(F)(t)=δ_(n) ^(F)(t)e _(n) ^(F)(t)  (22)ε_(n) ^(F)(t)=δ_(n) ^(B)(t)e _(n) ^(B)(t).  (23)

Once the a posteriori prediction residuals have been calculated usingEquations (22) and (23), they may be used as hereinafter described tocalculate interpolation residuals that make up the un-normalised Kalmangain vector, which is the numerator of the right hand side of Equation(7). However it is advantageous for initialisation purposes as notedabove to calculate the interpolation residuals in a priori form togetherwith corresponding conversion factors. Hence in the present exampleEquations (22) and (23) are not implemented.

FIGS. 3 and 4 show how an a priori interpolation residual such ase_(p,f)(t−f) (and the corresponding conversion factor) can be used,together with certain prediction residuals, to generate two otherinterpolation residuals: relative to the indices and time instant of theoriginal interpolation residual, one of the two residuals generated fromit has a p index increased by 1 and the other has an f index increasedlikewise and a time instant one sample interval earlier, i.e.e_(p+1,f)(t−f) and e_(p,f+1)(t−f−1).

In FIG. 3, the a priori interpolation residual e_(p,f)(t−f) and itsassociated conversion factor δ_(p,f)(t−f) are converted into the apriori interpolation residual e_(p+1,f)(t−f) and its associatedconversion factor δ_(p+1,f)(t−f). The process involves two internalcells (squares) I₁ and I₂ and two boundary cells B₁ and B₂. Boundarycells evaluate rotation parameters and, in the case of B₂, cumulativelymultiply cosine-like rotation parameters; internal cells apply rotationparameters to data. Each of these cells also recomputes and stores aquantity in response to each input, and their processing functions willbe described next in more detail.

-   -   Boundary cell B₁: uses the a priori interpolation residual        e_(p,f)(t−ƒ) and conversion factor δ_(p,ƒ)(t−ƒ) to update a        quantity stored within it, and it calculates square root free        rotation parameters from the stored quantity and passes them to        first internal cell I₁; it passes the conversion factor        δ_(p,ƒ)(t−ƒ), relating a priori and a posteriori residuals, at        its input to the boundary cell B₂;    -   internal cell I₁: uses the a priori backward prediction residual        e_(p+f+1) ^(B)(t) and rotation parameters from boundary cell B₁        to update a stored quantity; it then calculates the quantity        ê_(p+ƒ) ^(B)(t,t−ƒ) and passes it to the boundary cell B₂;    -   boundary cell B₂: uses ê_(p+ƒ) ^(B)(t,t−ƒ) e_(p,ƒ)(t−ƒ) and        conversion factor δ_(p,ƒ)(t−ƒ) to update a stored quantity, and        it calculates square root free rotation parameters from the        stored quantity and passes them to internal cell I₂; it        calculates the conversion factor δ_(p+1,ƒ)(t−ƒ) relating a        priori and a posteriori residuals;    -   internal cell I₂: uses e_(p,ƒ)(t−ƒ) e_(p+ƒ+1) ^(B)(t) and        rotation parameters from boundary cell B₂ to update a stored        quantity; it then calculates the a priori interpolation residual        e_(p+1,f)(t−f);

In FIG. 4, the a priori interpolation residual e_(p,f)(t−f) and itsassociated conversion factor δ_(p,f)(t−f) are converted into the apriori interpolation residual e_(p,f+1)(t−f−1) and its associatedconversion factor δ_(p,ƒ+1)(t−ƒ−1). With the exception of the two delaycells (D₁ and D₂), the underlying operations are exactly the same asdescribed in relation to FIG. 3 it is the use of different input signalsthat distinguishes the approaches described with reference to FIGS. 3and 4.

In FIG. 4, the process involves two delay cells D₁ and D₂, two internalcells (squares) I₃ and I₄ and two boundary cells B₃ and B₄. Each of thedelay cells D₁ and D₂ stores an input signal for a single time interval:when it receives a new input signal h(t) it outputs the preceding inputsignal h(t−1) and stores the new one until the next input signal h(t+1)is received. Boundary cells B₃ and B₄ evaluate rotation parameters and,in the case of B₄, cumulatively multiply cosine-like rotationparameters; internal cells apply rotation parameters to data. Each ofthese cells also recomputes and stores a quantity in response to eachinput, and their processing functions will be described later in moredetail.

-   -   Boundary cell B₃: uses the a priori interpolation residual        e_(p,ƒ)(t−ƒ−1), output from delay cell D₂, and conversion factor        δ_(p,f)(t−f−1) to update a quantity stored within it, and it        calculates square root free rotation parameters from the stored        quantity and passes them to first internal cell I₁; it passes        the conversion factor δ_(p,ƒ)(t−ƒ−1), output from delay cell D₁,        relating a priori and a posteriori residuals, at its input to        the boundary cell B₄;    -   internal cell I₃: uses the a priori forward prediction residual

e_(p + f + 1)^(F)(t)and rotation parameters from boundary cell B₃ to update a storedquantity; it then calculates the quantity

ê_(p + f)^(F)(t, t − f − 1)and passes it to the boundary cell B₄;

-   -   boundary cell B₄: uses

ê_(p + f)^(F)(t, t − f − 1)and conversion factor δ_(p,ƒ)(t−ƒ−1) to update a stored quantity, and itcalculates square root free rotation parameters from the stored quantityand passes them to internal cell I₂; it calculates the conversion factorδ_(p,ƒ+1)(t−ƒ−1) relating a priori and a posteriori residuals;

-   -   internal cell I₄: uses e_(p,ƒ)(t−ƒ−1) and rotation parameters        from boundary cell B₄ to update a stored quantity; it then        calculates the a priori interpolation residual e_(p,ƒ+1)(t−ƒ−1);

The details of the function of the cells in FIGS. 3 and 4 will now bedescribed. In IEEE Trans SP January 2000 vol 48(1) pp.70-79, “QRdecomposition based least squares lattice interpolators”, J T Yuan showsthat it is possible to use an a posteriori interpolation residualε_(p,ƒ)(t−ƒ), or a one step delayed equivalent ε_(p,ƒ)(t−ƒ−1), togenerate a posteriori interpolation residuals ε_(p+1,ƒ)(t−ƒ) andε_(p,ƒ+1)(t−ƒ−1) as set out in Equations (24) and (25) below:

$\begin{matrix}{{ɛ_{{p + 1},f}\left( {t - f} \right)} = {{ɛ_{p,f}\left( {t - f} \right)} + {{k_{{p + 1},f}(t)}{{\hat{ɛ}}_{p + f}^{B}\left( {t,{t - f}} \right)}}}} & (24) \\{{ɛ_{p,{f + 1}}\left( {t - f - 1} \right)} = {{ɛ_{p,f}\left( {t - f - 1} \right)} + {{\mu_{p,{f + 1}}(t)}{{\hat{ɛ}}_{p + f}^{F}\left( {t,{t - f - 1}} \right)}}}} & (25)\end{matrix}$where k_(p+1,ƒ)(t) and μ_(p,ƒ+1)(t) are coefficients determined by aleast squares minimisation procedure, i.e. k_(p+1,ƒ)(t) and μ_(p,ƒ+1)(t)are set equal to the value of the coefficients ω₁ and ω₂ (respectively)that minimise the weighted sum of the squares of values of the terms

ɛ_(p, f)(t − f) + ω₁ɛ̂_(p + f)^(B)(t, t − f)and

ɛ_(p, f)(t − f − 1) + ω₂ɛ̂_(p + f)^(F)(t, t − f − 1)(respectively) over a succession of evaluations at different times t.The quantities

ɛ̂_(p + f)^(B)(t, t − f)and

ɛ̂_(p + f)^(F)(t, t − f − 1)are calculated from other known quantities as will be described later.Here the circumflex symbol above ε in each case indicates that thequantities

ɛ̂_(p + f)^(B)(t, t − f)and

ɛ̂_(p + f)^(F)(t, t − f − 1)are, respectively, not the same as the backward and forward linearprediction residuals given by Equations (16) and (14);

ɛ̂_(p + f)^(B)(t, t − f)is a backward prediction residual of order (p+ƒ) at time (t) derivedwith the omission of data element x(t−ƒ), and

ɛ̂_(p + f)^(F)(t, t − f − 1)is a forward prediction residual of order (p+ƒ) at time (t) derived withthe omission of data element x(t−ƒ−1).

In the above analysis the time index (t−ƒ) has been used forconvenience, although strictly speaking it can be shown mathematicallythat the parameter f is superfluous.

There are several well known methods for solving least squaresminimisation problems. In this example such a method based on QRDecomposition (QRD) is employed: it will be described in terms ofdetermining a coefficient k(t) in a generalised equation as follows.Each of Equations (24) and (25) is of the form:ζ(t)=ψ(t)+k(t)ξ(t)  (26)

Equation (26) it is required to calculate, at time instant t, thecoefficient k(t) given input data ξ(t) and ψ(t) (a posteriori modifiedprediction and interpolation residuals respectively), and subject to theconstraint that in so doing there is minimisation of a sum of squares ofthe quantity ψ(i)+ωξ(i) calculated for certain different times i andweighted by forget factors β: i.e. k(t) is equal to the value of aparameter ω that delivers a minimum value for the expression:

$\begin{matrix}{\sum\limits_{i = 0}^{L}\;{\beta^{2\; i}\left( {{\psi\left( {t - i} \right)} + {{\omega\xi}\left( {t - i} \right)}} \right)}^{2}} & (27)\end{matrix}$

The parameter L determines the number of past samples that are includedin the summation. As mentioned earlier L may be fixed or may vary withtime, and β is a forget factor. The effect of calculating k(t) in thisway is to attenuate any component of ξ(t) which also appears in ψ(t).The term ζ(t) is an a posteriori least squares residual for the problem.

In this example the preferred method for solving the least squaresproblem given in Equations (26) and (27) involves not the a posterioriresiduals ξ(t) and ψ(t) but corresponding a priori residuals, u(t) andv(t), respectively, to which a conversion factor δ_(m)(t) is applied: itmight be anticipated that there is a different conversion factor foreach residual, but in fact in the present example it has been found thatthey are the same. In addition the preferred method does not solve theleast squares problem given in Equations (26) and (27) directly butinstead uses a recursive procedure that calculates k(t) based on the apriori residuals u(t) and v(t) and the conversion factor δ_(m)(t) atonly the current time instant and knowledge of the values of variousparameters calculated at the previous time instant and stored for lateruse. One such stored parameter is the value of the least squarescoefficient at the previous time instant i.e. k(t−1).

By an analysis related to that of S Hammarling, “A Note on Modificationsto the Givens Plane Rotation”, J. Inst. Maths. Applics., vol. 13, pp.215-218, 1974, and described in U.S. Pat. No. 4,727,503 to McWhirter, itcan be shown that k(t) can be calculated from the followingrecursions—which also yield the a priori residual associated with ζ(t)(i.e. z(t)) and the corresponding conversion factor (δ_(out)(t)):d(t)=β² d(t−1)+δ_(in)(t)|u(t)|²  (28)

$\begin{matrix}{s = \frac{{\delta_{i\; n}(t)}{u(t)}}{d(t)}} & (29) \\{{\delta_{out}(t)} = \frac{\beta^{2}{\mathbb{d}\left( {t - 1} \right)}{\delta_{i\; n}(t)}}{\mathbb{d}(t)}} & (30)\end{matrix}$z(t)=v(t)+k(t−1)u(t)  (31)k(t)=k(t−1)−sz(t)  (32)

The term d(t) is defined by Equation (28) and represents an estimate ofthe energy of the time series u(0) to u(t) inclusive but with moreweight given to more recent samples. All terms on the right-hand side ofEquation (28) are known at time t: the term d(t−1) was obtained in theimmediately preceding time instant by calculation if t=2 or more; fort=1 the corresponding initial term d(0) is given a predefined value suchas zero; the terms δ_(in)(t) and u(t) are input data. The term s is ageneralised rotation parameter of the kind used in QR decomposition: itis defined by Equation (29) in terms of input data δ_(in)(t) and u(t)together with a quantity d(t) calculated for the relevant time instantusing Equation (28). Equation (30) defines the conversion factorδ_(out)(t) for the residual z(t) in terms of known quantities: i.e.d(t−1) is known from the previous time instant and δ_(in)(t) and d(t)have been calculated for this time instant using Equation (28). TogetherEquations (28) to (30) inclusive represent the calculation that takesplace for each time instant in the boundary cells Ba₁ etc. shown in FIG.2 and the boundary cells B₂ and B₄ shown in FIGS. 3 and 4.

Equations (31) and (32) represent the calculation that takes place foreach time instant in the internal cells Ia₁ etc. in FIG. 2 and theinternal cells I₂ and I₄ shown in FIGS. 3 and 4. Equation (31) expressesz(t) in terms of two known quantities (input data) u(t) and v(t), and avalue k(t−1) evaluated earlier for data of time instant (t−1) where t=2or more; t=1 corresponds to the term k(0) which is set equal to apredefined value such as zero. Equation (32) expresses k(t) in terms ofquantities s and z(t) calculated from Equations (29) and (31) togetherwith k(t−1) from the previous time instant. With the knowledge of anystarting value eg k(0) it is therefore possible to generate a successionof subsequent values of z(t) and k(t) using Equations (28) to (32):

Referring to Equations (24) and (25) once more, the quantities

ɛ̂_(p + f)^(B)(t, t − f)and

ɛ̂_(p + f)^(F)(t, t − f − 1)are required in the foregoing calculations, and they are in factobtained from other known quantities according to:

$\begin{matrix}{{{\hat{ɛ}}_{p + f}^{B}\left( {t,{t - f}} \right)} = {{ɛ_{p + f + 1}^{B}(t)} - {{\eta_{{p + 1},f}(t)}{ɛ_{pf}\left( {t - f} \right)}}}} & (33) \\{{{\hat{ɛ}}_{p + f}^{F}\left( {t,{t - f - 1}} \right)} = {{ɛ_{p + f + 1}^{F}(t)} - {{{??}_{p,{f + 1}}(t)}{ɛ_{p,f}\left( {t - f - 1} \right)}}}} & (34)\end{matrix}$where η_(p+1,f)(t) and v_(p,f+1)(t) are coefficients determined by amodified least squares minimisation procedure which will now bedescribed: this procedure involves recursive least squares minimisation,and the two least squares problems in Equations (33) and (34) have thesame form, i.e.:ψ(t)=ζ(t)−k(t)ξ(t)  (35)where ζ(t) and ξ(t) are known input data (a posteriori prediction andinterpolation residuals respectively), and k(t) and ψ(t) are to becalculated. Although it is a modified procedure as compared to thatassociated with Expressions (26) and (27), it can be shown that thisrecursive least squares procedure calculates the same value for thecoefficient k(t) that would have been obtained by solving theconventional least squares problem described by Expressions (26) and(27) and implemented by recursion Equations (28) to (32): the latter hadψ(t) and ξ(t) as input data and yielded k(t) and ζ(t), but in thepresent case ψ(t) is not available. However it can be shown that k(t)can be derived without knowledge of ψ(t) provided k(t−1) and ζ(t) areknown. This is explained below but in terms of the corresponding apriori residuals and associated conversion factors: a priori residuals(Roman symbols) z(t), u(t) and v(t) correspond respectively to aposteriori residuals (Greek symbols) ζ(t), ξ(t) and ψ(t).

Equations (28) to (32) show that k(t) can be derived from k(t−1) withknowledge of v(t). Furthermore the recursion produces z(t) as aby-product. For the purpose of implementing the aforementioned modifiedrecursive least squares procedure, it can be shown that these equationscan be reordered in a way such that it is possible to derive a value fork(t) from k(t−1) without knowledge of v(t) provided z(t) is known whichis the case. Moreover, these equations, so reordered, produce v(t) as aby-product and are set out in Equations (36) to (40) below:d(t)=β² d(t−1)+δ_(in)(t)|u(t)|²  (36)

$\begin{matrix}{s = \frac{{\delta_{i\; n}(t)}{u(t)}}{d(t)}} & (37)\end{matrix}$δ_(out)(t)=δ_(in)(t)  (38)v(t)=z(t)−k(t−1)u(t)  (39)k(t)=k(t−1)−s z(t)  (40)

It is reiterated that Equations (36) to (40) are in terms of a prioriresiduals (Roman symbols) z(t), u(t) and v(t), which correspondrespectively to a posteriori residuals (Greek symbols) ζ(t), ξ(t) andψ(t).

The term d(t) is defined by Equation (36) and represents an estimate ofthe energy of the time series u(0) to u(t) inclusive but with moreweight given to more recent samples. All terms on the right-hand side ofEquation (36) are known at time t: for t=2 or more, the term d(t−1) isknown from the previous time instant; in the case of t=1 and d(0), d(0)is set equal to a predefined value such as zero; the terms δ_(in)(t) andu(t) are input data. The term s is a generalised rotation parameter: itis defined by Equation (37) in terms of input data δ_(in)(t) and u(t)and d(t) calculated using Equation (36). Equation (38) defines theconversion factor δ_(out)(t) for the residual v(t) as being equal to theknown input conversion factor δ_(in)(t). Together Equations (36) to (38)inclusive represent the calculation carried out in the boundary cells B₁and B₃ and described with reference to FIGS. 3 and 4.

Equations (39) and (40) express the calculation carried out in theinternal cells I₁ and I₃ shown in FIGS. 3 and 4. Equation (39) expressesv(t) in terms of known quantities, i.e. input data u(t) and z(t)together with a value k(t−1) evaluated earlier for data of time (t−1),or, in the case of k(0), set to a predefined values such as zero.Equation (40) expresses k(t) in terms of the quantities s calculatedusing Equation (37) and the input datum z(t) together with k(t−1) fromthe previous time instant. With the knowledge of any starting value egk(0) it is therefore possible to generate a succession of subsequentvalues of v(t) and k(t) for subsequent time instants using Equations(36) to (40).

Equations (24) et sequi show that an a posteriori interpolation residualsuch as ε_(p,f)(t−f) (or equivalently an a priori interpolation residuale_(p,f)(t−f) and the corresponding conversion factor δ_(p,f)(t−f)) or aone sample time interval delayed equivalent e_(p,f)(t−f−1) can be used,together with certain prediction residuals, to generate two otherinterpolation residuals: relative to the indices of the originalinterpolation residual, one of the two residuals generated from it has ap index increased by 1 and the other has an f index increased likewise,i.e. ε_(p+1,f)(t−f) and ε_(p,f+1)(t−f−1). This is shown in FIGS. 3 and 4in terms of a priori interpolation residuals and the correspondingconversion factors.

For convenience the vector {circumflex over (k)} ^(N)(t) is defined tobe the un-normalized Kalman gain vector of order N: i.e. the ith elementof {circumflex over (k)} ^(N)(t) is {circumflex over (k)}_(i)^(N)(t)=ε_(N−i,i−1)(t−i+1)—see Equation (7). There are many possibleways of using the method described with reference to FIGS. 3 and 4 togenerate the interpolation residuals which are the elements {circumflexover (k)}_(Q) ^(N)(t) (Q=1 to N) of the un-normalized Kalman gain vector{circumflex over (k)}^(N)(t) for an Nth order adaptive filter.

For economy of description, the following discussion is in terms of aposteriori residuals, whereas equivalently in the present example apriori residuals and conversion factors are in fact used. Interpolationresiduals ε_(p,0)(t) and ε_(0,f)(t−f) have a special property. Theinterpolation residual ε_(N−i,0)(t) has p=N−i and f=0: f=0 correspondsto there being no subsequent terms in the time series. This residual istherefore calculated only from preceding terms, i.e. from the same termsin the time series and in the same way as the forward predictionresidual

ɛ_(N − i)^(F)(t)see Equation (14): these two residuals are therefore equal. Moreover,since the forward prediction residual can obtained using a least squareslattice processor, so also can the interpolation residual ε_(N−i,0)(t).

As stated there are many ways that the methods described with referenceto FIGS. 3 and 4 can be utilised to generate the elements of theun-normalized Kalman gain vector {circumflex over (k)}^(N)(t). Forexample, following the principle elicited above the interpolationresidual ε_(N−i,0)(t) could be used to generate another interpolationresidual with an f index increased by 1 and corresponding to theimmediately preceding or (t−1)th time sample, namely ε_(N−i,1)(t−1);this procedure is repeated to produce ε_(N−i,2)(t−2) fromε_(N−i,1)(t−1). Iteration is repeated a total of (i−1) times untilε_(N−i,i−1)(t−i+1) is generated, i.e. the iteration stages are:ε_(N−i,0)(t)

ε_(N−i,1)(t−1)

ε_(N−i,2)(t−2)

. . .

ε_(N−i,i−1)(t−i+1)  (41)

An alternative method for generating ε_(N−i,i−1)(t−i+1) is as follows.The interpolation residual ε_(0,i−1)(t−i+1) (i.e. p=0, f=i−1) is thesame as the (i−1)th backward prediction residual for time t: ε_(i−1)^(B)(t)—see Equation (16). Following the principle elicited above theinterpolation residual ε_(0,i−1)(t−i+1) is used to generateε_(1,i−1)(t−i+1) with a p index increased by 1; this procedure isiterated to produce ε_(2,i−1)(t−i+1) from ε_(1,i−1)(t−i+1). Iteration isrepeated a total of (N−i) times until ε_(N−i,i−1)(t−i+1) is generated,i.e.:ε_(0,i−1)(t−i+1)

ε_(1,i−1)(t−i−1)

ε_(2,i−1)(t−i+1)

. . .

ε_(N−i,i−1)(t−i,i−1).  (42)

The term ε_(N−i,i−1)(t−i+1) is the ith element of the un-normalizedKalman gain vector {circumflex over (k)}_(i) ^(N)(t).

The two methods indicated by iterations (41) and (42) above require ofthe order of N² (O(N²)) computations to generate all of the requiredinterpolation residuals making up the un-normalised Kalman gain vector,where N is the order of the adaptive filter. An alternative procedurewill now be described to reduce the number of computations to O(Nlog₂N).

For convenience of description it will be assumed that N=2^(x), where Nis the order of the filter and x is a positive integer. As described inconnection with iterations (41) and (42), an interpolation residual ofthe kind ε_(p,ƒ)(t−ƒ) such as ε_(M,M−1)(t−M+1) (M=N/2) can be calculatedby iteration starting from an interpolation residual ε_(M,0)(t); becauseε_(M,0)(t) has p=M and f=0, it is derived from M preceding but 0 (no)succeeding time series data elements: it is therefore identical to the aposteriori forward prediction residual ε_(M) ^(ƒ)(t) which is obtainedfrom the same data elements by the same calculation. In consequence,ε_(M,M−1)(t−M+1) is obtainable by iteration of the type shown inEquation (41) from ε_(M) ^(ƒ)(t) itself obtained as described earlierfrom the least squares lattice processor. Furthermore, in the process ofthis iteration the residuals e_(M,ƒ)(t−ƒ) with values of f between 0 andM−1 are calculated, i.e. f=1 to M−2, namely:ε_(M,1)(t−1), ε_(M,2)(t−2), ε_(M,3)(t−3), . . . ,ε_(M,M−2)(t−M+2)  (43)

The interpolation residuals in (43) are also intermediate quantitiesthat would be calculated using the iteration shown in Equation (42) tocalculate the residuals ε_(N−i,i−1)(t−i+1) where (2≦i≦M−1). Hence if theintermediate quantities at (43) are stored for later reuse it ispossible to avoid some of the computation involved in calculating otherresiduals making up the un-normalised Kalman gain vector.

Similarly, the interpolation residual ε_(M−1,M)(t−M) is obtainable bythe iteration shown in Equation (42) from ε_(0,M)(t−M), which itself isidentical to the a posteriori backward prediction residual ε_(M)^(B)(t): this iteration also generates intermediate quantitiesε_(p,M)(t−M) with values of p from 1 to M−2.

Similar savings are also possible for calculation of other terms, i.e.ε_(N−i,i−1)(t−i+1) with values of i given by (M+2≦i≦N−1) via iterations(41).

Storing intermediate quantities whilst calculating ε_(N−3P,3P−1)(t−3P+1)from ε_(N−3P,M)(t−M) (where P=N/4) gives a reduction in computation ofresiduals ε_(N−i,i−1)(t−i+1) with values of i given by (M+2≦i≦3P−1)calculated via Equation (42). In total there are four such sets ofcomputations, i.e. ε_(N−i,i−1)(t−i+1) with values of i given by(M+2≦i≦3P−1), (3P+2≦i≦N−1), (2≦i≦P−1) and (P+2≦i≦M−1). Furthermore thesefour set of computations can then be split in to eight in a similarmanner and so on doubling repeatedly until N residuals needed for theKalman gain vector are generated.

The procedure starts by calculating two elements of the un-normalisedKalman gain vector near its centre using the iteration approaches of(41) and (42). The Mth element ε_(M,M−1)(t−M+1) is calculated viarepeated use of Equation (41) starting from the forward residuale_(M,0)(t). The (M+1)th element ε_(M−1,M)(t−M) is calculated by repeateduse of Equation (42) starting from the backward residual ε_(0,M)(t−M).It is not in fact essential to use these starting residuals but it isbelieved that they are the most economical ones to use in terms of thenumber of computations required.

FIG. 5 graphically depicts interpolation residuals ε_(p,ƒ)(t−ƒ) plottedagainst horizontal and vertical axes 50 and 52 in accordance with theirƒand p index values. Each point in the triangular space 54 represents aninterpolation residual. The horizontal position of a point on thediagram and the associated ƒindex value represent the number of futuredata samples used to generate the interpolation residual. The verticalposition and the associated p index value represent the number of pastdata samples used to generate the interpolation residual. Values on thehorizontal axis 50 are interpolation residuals with ƒ=0 to N−1 and p=0,and therefore they are also backward prediction residuals ε_(ƒ) ^(B)(t)obtained as previously described and for reasons given then. Similarly,values on the vertical axis 52 are interpolation residuals with ƒ=0 andp=0 to N−1, and therefore they are also forward prediction residualsε_(p) ^(F)(t) obtained as before.

The interpolation residuals required to be calculated are positioned ona diagonal dotted line 56 shown as line halves 56 a and 56 b—they are ashas been said the elements {circumflex over (k)}_(Q) ^(N) of theun-normalised Kalman gain vector {circumflex over (k)} ^(N)(t). Thesequence of residuals generated in an iteration to calculate tworesiduals ε_(M,M−1)(t−M +1) and ε_(M−1,M)(t−M) at 58 and 60 collectivelycentral to the diagonal 56 are shown as chain lines 62 and 64respectively.

There are now two sub-problems, identical in principle to the originalproblem, but of order M=N/2: they are indicated by respective linehalves 56 a and 56 b. The next step of the procedure is to calculate twopairs of elements, each pair collectively central to a respective linehalf 56 a or 56 b. Let N/4=M2=P. The Pth element of {circumflex over(k)} ^(N)(t), i.e. ε_(3P,P−1)(t−P+1) is calculated by iteration ofEquation (41) starting from the forward residual ε_(3P,0)(t).

The (P+1)th element of {circumflex over (k)} ^(N)(t), i.e.ε_(3P−1,P)(t−P) is calculated by iteration of Equation (42) staring fromthe interpolation residual ε_(M,P)(t−P), which itself was calculatedduring the calculation of {circumflex over (k)}_(M)^(N)=ε_(M,M−1)(t−M+1).

The 3Pth element of {circumflex over (k)}^(N)(t), i.e.ε_(P,3P−1)(t−3P+1) is calculated by iteration of Equation (41) startingfrom the residual ε_(P,M)(t−M).

The (3P+1)th element of {circumflex over (k)} ^(N)(t), i.e.ε_(P−1,3P)(t−3P) is calculated by iteration of Equation (42) startingfrom the residual ε_(0,3P)(t−3P).

The evolution of the computation to this point is illustrated in FIG. 6,in which parts equivalent to those described earlier arelike-referenced: four horizontal and vertical chain lines 66 have becomeadded to this drawing as compared to FIG. 5 indicating calculation offour additional interpolation residuals or elements of the un-normalisedKalman gain vector.

FIG. 6 illustrates that there are now four sub-problems of order P, asindicated by four diagonal quarter lines such as 56 c. This procedure ofsplitting each sub-problem into two half-sized sub-problems andevaluating two adjacent residuals at a line portion central region iscontinued until all elements of the un-normalised Kalman gain vector arecalculated. It is referred to as a “divide and conquer” approach: itreduces the number of computations required from O(N²) to O(Nlog₂N)because it is not necessary to evaluate residuals indicated by spacesbetween lines within the drawing.

For the case N=64 the set of interpolation residuals actually calculatedis illustrated in FIG. 7. As indicated earlier, vertical lines such as70 indicate residuals iterated downwards using Equation (42) andhorizontal lines such as 72 indicate residuals iterated to the rightusing Equation (41).

From inspection of FIG. 7, it is seen that the general approach is togenerate two orthogonal lines of residuals, select a residualapproximately half way along each line and iterate orthogonally to thedirection of the line, generating a respective new line of iterationsuntil the required value of the un-normalised Kalman gain vector isreached. This procedure is repeated using residuals approximately halfway to previously selected residuals and also using residualsapproximately half way along each new line of iterations.

Strictly speaking, “half way along each line” is inaccurate becausethere is no residual at the half way point: instead the half way pointis midway between two values and, in the case of the original predictionresiduals (upon the uppermost and leftmost axes 50 and 52 in FIG. 5), oneach side of the halfway point there are N/2 residuals (thirty-two forN=64). One begins an iteration with one of the two residuals immediatelyadjacent to and on opposite sides of a halfway point: the residual withwhich one begins is that having the higher value of the index which willnot be incremented in the next iteration. This is not actually essentialbut it is believed to result in a numerically robust procedure with thesmallest the number of iterative steps. In this connection, referringback to FIG. 6 once more, to produce the un-normalised Kalman gainvector the starting point was the forward prediction residual ε_(M,0)(t)and iteration was carried out for M−1 steps until the second or ƒindex(initially zero) became M−1. One could also begin with the backwardprediction residual ε_(0,M−1)(t) and iterate until the first or p indexbecomes M, but this requires one more iteration and does not help the“divide and conquer” process. Later iterations begin similarly with aresidual immediately adjacent a halfway point and having a higher valueof the index not to be incremented, and the relevant halfway point ishalfway between an interpolation residual providing a value of{circumflex over (k)}_(Q) ^(N)(t) obtained earlier and a startingresidual for an earlier iteration.

For filters with order N not equal to an integer power of two, thedivide and conquer principle described above dan still used but theproblem is divided up in a different way. Splitting each problem intotwo equal sub-problems will no longer be wholly appropriate to give acomplete solution. A problem or sub-problem can be split into two ormore not necessarily equal sub-problems: e.g. N=48 could be split intosub-problems of order 32 and 16 one of which yields 32 weight vectorelements and the other 16. Any integer value of N can be treated as asum of numbers each of which is a power of two and yields a subset ofthe required weight vector elements: hence the problem could be solvedin this way. Many other “divide and conquer” schemes are possible basedon the many different ways of dividing filter weight vectors with orderN not equal to an integer power of two into sub-weight-vectors withunequal orders.

Referring to Stage 16 of FIG. 1 the jth element of the Kalman gainvector is calculated using Equation 7: in this equation the numerator isan a posteriori interpolation residual ε_(N−j,j−1)(t−j+1) which is thejth element of the un-normalised Kalman gain vector. The a posterioriresidual will have been calculated as in FIGS. 3 and 4 as an a prioriresidual e_(N−j,j−1)(t−j+1) and a conversion factor δ_(N−j,j−1)(t−j+1).

The a posteriori residual itself is calculated asε_(N−j,j−1)(t−j+1)=e _(N−j,j−1)(t−j+1)δ_(N−j,j−1)(t−j+1)  (44)

The denominator in Equation 7 is of the form E_(N−j,j−1)(t−j+1) which isupdated at each iteration byE _(N−j,j−1)(t−j+1)=β₃ ² E _(N−j,j−1)(t−j)+e_(N−j,j−1)(t−j+1)ε_(N−j,j−1)(t−j+1)  (45)

Referring now to FIG. 8, an Nth order adaptive filter is shownimplemented electronically. It consists of a chain of delay cells 102 ₁,to 102 _(N−1) connected in series between respective pairs of connectionnodes 104 ₁, to 104 _(N). These nodes are connected to respectiveamplifiers 106 ₁, to 106 _(N) with respective amplification factorsw₁(t) to w_(N)(t) and connected to an update bus 108. The amplifiers 106provide output to a summer 110.

Each delay cell 102 provides a delay equal to the time betweensuccessive values of an input signal sample x(t) input to the first node104 ₁, first delay cell 102 ₁, and first amplifier 106 ₁. Upon clockactivation it outputs a signal sample input x(t−1) received on animmediately preceding clock cycle and inputs a new signal sample x(t).In consequence, when x(t) is input to the first cell 102 ₁ and firstamplifier 106 ₁, the ith (i=2 to N−1) cell 102 ₁, and ith amplifier 106_(i) (i=2 to N) receive input of x(t−i+1). The amplifiers haverespective gain factors which collectively apply a weight vector to thesignals x(t) to x(t−N+1): i.e. the ith amplifier 106 _(i) has a gain ofw_(i)(t) a time t and produces an output w_(i)(t)x(t−i+1). The summer110 sums the amplifier outputs, i.e. its output S₀ is:

$\begin{matrix}{S_{0} = {\sum\limits_{i = 1}^{N}{{w_{i}(t)}{x\left( {t - i + 1} \right)}}}} & (46)\end{matrix}$which is the required adaptive filter output as in Equation (1). Theweights w_(i)(t) are updated via the update bus 108 in accordance withEquations (3) and (4) given earlier.

Provided the Kalman gain vector is known, the equations given in theforegoing description can clearly be evaluated by an appropriatecomputer program comprising program instructions recorded on anappropriate carrier medium and running on a conventional computersystem. The carrier medium may be a memory, a floppy or compact oroptical disc or other hardware recordal medium, or an electrical signal.Such a program is straightforward for a skilled programmer to implementfrom the foregoing description without requiring invention, because itinvolves well known computational procedures. An outline for a programto calculate the Kalman gain vector will now be described assumingN=2^(x) where x is a positive integer. The code is written using therules of the MATLAB® computer program: the semicolon (;) indicates theend of an instruction, the epsis ( . . . ) indicates the instructioncontinues on the next line, the symbol ‘.*’ indicates element-wisemultiplication of two vectors: element-wise multiplication of twovectors a and b having respective elements a_(i), b_(i) is defined asforming a vector c with an i^(th) element c_(i) equal to the producta_(i)b_(i) of like-indexed pairs of elements a_(i), b_(i)(i=1, 2, 3, . .. etc).

The Nth order un-normalised Kalman gain vector (the symbol ‘k’ below) ata given time instance is given by the four line program. The conversionof the un-normalised Kalman gain vector into the true (i.e. normalised)Kalman gain vector according to Equation (10) is straightforward for askilled programmer to implement without requiring invention, because itinvolves well known computational procedures.

-   -   [km, deltakm, store state]=interpolate (N, fr, br, deltaf,        deltab, fr, br, beta, store, state);    -   kp=[fr(N), km, br (N)];    -   deltak=[deltaf(N), deltakm, deltab(N)];    -   k=kp .* deltak;        where the N-dimensional vectors ‘fr’ and ‘br’ contain the a        priori forward and backward predictions residuals of order 0 to        N−1 for that time instance, deltaf, deltab are N-dimensional        vectors containing the corresponding conversion factors linking        the a priori residuals to the a posteriori residuals (the a        posteriori residual is equal to the product of the a priori        residual and the conversion factor) and the variables ‘store’        and ‘state’ are used to store intermediate quantities and are        arrays of dimension N×N×2 and N×N×4 respectively. The forward        and backward prediction residuals and the conversion factors can        be produced as previously described using for example a        QRD-based LSL program. The intermediate quantities ‘store’ and        ‘state’ need to be initialised before running the program for        the first time. This will be described later.

Defining n=2m and m being an integer not less than 2, the function‘interpolate’ calculates an (n−2)-dimensional vector of interpolationresiduals that constitute the diagonal of a triangular region in thespace depicted in FIG. 4 defined by the horizontal line 50 and thevertical line 52. The two residuals near the centre of the diagonal arecalculated using the recursions (41) and (42) via calls to the functions‘across’ and ‘down’. If n is greater than 4, the remaining diagonalelements are calculated by two further (recursive) calls to the function‘interpolate’ consisting of data of dimension n/2 (the “divide andconquer” technique). The function ‘interpolate’ is:

-   -   function [k, deltak, store, state]=interpolate(n, v, h, deltav,        deltah, fr, br, beta, store0, state0) store=store0;        state=state0;    -   if (n>4)        -   [v1, deltav1, state(1:n/2,n/2+1,:)]= . . . down(n/2,            h(n/2+1), deltah(n/2+1), br(n/2+1:n), beta,            state(1:n/2,n/2+1,:));        -   [h1, deltah1, store((n/2+1),1:(n/2),:),            state((n/2+1),1:(n/2),:)]= . . . across(n/2, v(n/2+1),            deltav(n/2+1), fr((n/2+1):n), beta,            store((n/2+1),1:(n/2),:), . . . state((n/2+1),1:(n/2),:));        -   [ku, deltaku, store(1:n/2,n/2+1:n,:),            state(1:n/2,n/2+1:n,:)]= . . . interpolate(n/2, v1,            h(n/2+1:n), deltav1, deltah(n/2+1:n), fr(n/2+1:n),            br(n/2+1:n), . . . beta, store(1:n/2,((n/2)+1):n,:),            state(1:n/2,n/2+1:n,:));        -   [k1, deltak1, store(n/2+1:n,1:n/2,:),            state(n/2+1:n,1:n/2,:)]= . . .            interpolate(n/2,v(n/2+1:n),h1, deltav(n/2+1:n), deltah1,            fr(n/2+1:n), br(n/2+1:n), . . . beta,            store(((n/2)+1):n,1:n/2,:), state(n/2+1:n,1:n/2,:));        -   k=[k1, h1(n/2), v1(n/2), ku];        -   deltak=[deltak1, deltah1(n/2), deltav1(n/2), deltaku];    -   else        -   [v1, deltav1, state(1:n/2,n/2+1,:)]= . . . down(n/2,            h(n/2+1), deltah(n/2+1), br(n/2+1:n), beta,            state(1:n/2,n/2+1,:));        -   [h1, deltah1, store(n/2+1,1:n/2,:), state(n/2+1,1:n/2,:)]= .            . . across(n/2, v(n/2+1), deltav(n/2+1), fr(n/2+1:n), beta,            store(n/2+1,1:n/2,:), state(n/2+1,1:n/2,:));        -   k=[h1(n/2), v1(n/2)];        -   deltak=[deltah1(n/2), deltav1(n/2)];    -   end

The function ‘down’ implements the iteration (42) i.e. starting from aresidual of the form ε_(p,f)(t−ƒ) (actually represented by the a prioriresidual h=e_(p,ƒ)(t−ƒ) and the conversion factor delta0=δ_(p,f)(t−ƒ))and produces the sequence of residuals ε_(p+i,f)(t−ƒ) for i=1 to n−1(again, represented by a priori residuals and conversion factors storedin v and delta1 respectively).

The function ‘update’ is common to the function ‘across’ both of whichare described below. The function ‘down’ is

-   -   function [v, delta11, state]=down (n, h, delta0, br, beta,        state0) state=state0;    -   v(1)=h; delta1(1)=delta0;    -   for i=2:n,        -   [v(i), delta1(i), state(i,1,:)]=update (v(i−1), delta1(i−1),            br(i), beta, state(i,1,:));    -   end

The function ‘across’ implements iteration (41) i.e. starting from aresidual of the form ε_(p,f)(t−ƒ) (actually represented by the a prioriresidual v=e_(p,f)(t−ƒ) and the conversion factor delta0=δ_(p,f)(t−ƒ))and produces the sequence of residuals ε_(p,f+i)(t−ƒ−i) for i=1 to n−1(again, represented by a priori residuals and conversion factors storedin h and delta1 respectively). Extra storage is needed compared to thefunction ‘down’ in order to implement the change in time index. Thefunction ‘update’ is common to the function ‘down’ and is describedbelow. The function ‘across’ is

-   -   function [h, delta1, store, state]=across(n, v, delta0, fr,        beta, store0, state0) state=state0; store=store0;    -   h(1)=v; delta1(1)=delta0;        -   for i=2:n,        -   [h(i), delta1(i), state(1,i,:)]=update (store(1,i,1),            store(1,i,2), fr(i), beta, state(1,i,:));        -   store(1,i,:)=[h(i−1), delta1(i−1)];    -   end

The function ‘update’ takes a residual ε_(p,f)(t−ƒ) (actuallyrepresented by the a priori residual x=e_(p,f)(t−ƒ) and the conversionfactor delta=δ_(p,f)(t−ƒ)) and increases either the index ‘f’ or theindex ‘p’ by one depending upon the value of the input ‘r’. No‘switching’ is required: if ‘r’ is the appropriate backward predictionresidual then the index ‘p’ is increased (see Equations (24) and (33));conversely if ‘r’ is the appropriate forward prediction residual thenthe index ‘f’ is increased (see equations (25) and (34)). As describedearlier this requires that an RLS and a modified RLS problem be solvedThis is done in the functions ‘rls’ and ‘mrls’ respectively. Thefunction ‘update’ is

-   -   function [y, delta1, state1]=update (x, delta, r, beta, state)        -   d=state(1); k=state(2);        -   dm=state(3); km=state(4);        -   [z,dm1,km1]=mris (x, r, delta,dm,beta,km);        -   [y,delta1,d1,k1]=ris (z, x,delta,d,beta,k);        -   state1(1)=d1; state1(2)=k1;        -   state1(3)=dm1; state1(4)=km1;

The function ‘rls’ solves a RLS problem as described in Equations (28)to (32) inclusive. The function ‘rls’ is

-   -   function [z,delta1,d1,k1]=ris (x, y,delta,d,beta,k)        -   if ((x==0)|(delta==0))        -   d1=beta^2*d;        -   s=0;        -   delta1=delta;    -   else        -   d1=beta^2*d+delta*abs(x)^2;        -   s=delta*x/d1;        -   delta1=beta^2*delta*d/d1;    -   end    -   z=y+k*x;    -   k1=k−s*z;

The function ‘mrls’ solves a modified RLS problem as described inEquations (36) to (40) inclusive. The function ‘mrls’ is

-   -   function [y,d1,k1]=mris (x, z,delta,d,beta,k)    -   if ((x==0)|(delta==0))        -   d1beta^2*d;        -   s=0;    -   else        -   d1=beta^2*d+delta*abs(x)^2;        -   s=delta*x/d1;    -   end    -   y=z−k*x;    -   k1=k−s*z;

In the above program listing and in the description of the associatedmathematics, there are certain quantities that are updated from timeinstance to time instance e.g. k(t−1)→k(t). Hence an initial value mustbe specified. An example of a way to perform this initialisation is asfollows: Firstly, all stored variables involved in the computation areset to zero with the exception of the stored conversion factors whichare set to unity. Then the program or hardware instantiation, up to butnot including the normalisation stage, is arranged to process aninitialising input time series consisting of a one followed by N zeros,where as before N is the order of the digital filter. This proceduregives appropriate initial values as required, after which a real timeseries can be processed.

In recent years a new type of algorithm (termed a ‘Fast Newton’algorithm) has appeared in the academic literature: see e.g.

-   -   M. Moonen and I. K. Proudler, “Using a Lattice Algorithm to        Estimate the Kalman Gain Vector in Fast Newton-type Adaptive        Filtering”, Proc. ICASSP'97, Munich, April 1997    -   G. V. Moustakides & S. Theodoridis, “Fast Newton Transversal        Filters—A New Class of Adaptive Estimation Algorithms”, IEEE        trans. SP-39(6); pp. 2184-2193. 1991.    -   D. K. Phillips and C. F. N. Cowan, “Zero-Phase Signal        Conditioning for Improved NLMS Performance”, 13th Int. Conf. on        DSP, 2-4 Jul. 1997, Santorini, Greece, pp.37-39.    -   K. Maouche and D. T. M. Slock, “A Fast Instrumental Variable        Affine Projection Algorithm”, Proc. ICASSP'98, Seattle, Wash.,        USA, pp. 1481-4.

Fast Newton algorithms are based on RLS algorithms and on an assumptionthat the input signal to be filtered can be modelled as a low orderautoregressive (AR) process. This assumption allows some of thecomputation in the RLS algorithm to be ignored with potentially verylittle loss in convergence performance but a large saving incomputation. In accordance with the invention it has been discoveredthat in order for this approach to be viable it is necessary for the RLSalgorithm in use to be of a type that produces filter coefficientsexplicitly—which excludes the use of RLS lattice algorithms known beforethis invention.

It is also advantageous, especially in acoustic applications, for thealgorithm to be convertible into a Fast Newton (U) algorithm in order tobe able to take advantage of any autoregressive nature in the inputsignal. It can be shown that the present invention is suitable forimplementation using an FN approach.

The invention is suitable for use in a variety of applications: suchapplications for adaptive filters are given by S Haykin, Adaptive FilterTheory, 2nd Edition, Prentice-Hall, Englewood Cliffs, N.J., USA, 1991.One such is system identification—here the requirement is to observe areal system and generate a computer model of its operation. The modelcan be used for many things such as health monitoring, developing acontrol system or to design a distortion compensation processor (e.g.for manipulating the acoustic character of rooms). System identificationis achieved by monitoring the input and output of the system and feedingthese signals as input data in digital electronic form to the adaptivefilter implemented on a computer system. Although not strictlynecessary, having the filter weights rather than some otherparameterisation of the system is usually considered desirable as theyare easier to interpret and manipulate.

Adaptive filters are also applicable to communications channelequalisation: in modem digital communication systems a high data rate isoften a requirement, where ‘high’ means high having regard to themaximum data rate of the communications channel in use. Individual databits in the system may be sent so close together that their signal waveforms merge due characteristics of the communications channel. Thisadversely affects the ability of a receiver in the system to recover theinformation correctly. The standard approach is to ‘equalise’ thechannel. This can be done either via a channel equalisation filter (anadaptive filter) or via Maximum Likelihood Sequence Estimation (usuallyimplemented using the Viterbi algorithm). Such equalisation is needed,for example, in the GSM mobile telephone system, digital TVtransmissions over cable networks, and data modems for telephone lines.

One of the most effective equalisation filter structures is the‘Decision Feedback Equaliser’. It uses previous decoding decisions(assumed to be correct) to counteract communications channel effects. Itrequires the ability to filter received signals independently of anyupdating mechanism for filter weights. Obtaining the filter weightsrather than some other parameterisation of the filter makes this taskeasier to implement.

Maximum Likelihood Sequence Estimation is a very powerful means ofequalising a communications channel and is used in the GSM mobiletelephone system. Unlike channel equalisation filtering, it works bydirectly estimating a transmitted information signal in a communicationschannel: to achieve this an estimate of the distortion introduced by thechannel is required and can be provided by an adaptive filter (cf.system identification above). Again it is desirable to model thecommunications channel directly in terms of filter weights, as opposedto a less direct approach.

In some applications, e.g. data modems for telephone lines, although theappropriate equalisation filter is unknown a priori and hence must bedetermined by an adaptive filter, once it is known it changes littlewith time. Here there is little point in continually updating the filterweights. The simplest alternative is to separate filtering fromcalculation of filter weights (sometimes called ‘offline processing’) sothat the latter is, done only when required. This gives an additionaladvantage for high speed data, when calculation of filter weights doesnot need to cope with the data rate even though filtering must do so.Again, obtaining the filter weights themselves rather than some otherfilter parameterisation makes this approach easier to implement.

In contrast to the aforementioned unknown but unchanging case, in somesituations, for example HF radio communication, communication channelproperties may vary with time requiring continual updating ofequalisation filter weights. This may be achieved in two basic ways: bythe use of either recursive or block adaptive filters. A recursiveadaptive filter continually updates filter weights as new data arrives.A block adaptive filter partitions data into blocks and processes eachblock independently of the others. Recursive adaptive filters may berequired to process many millions of data samples, and it is thereforevital that they behave correctly even when implemented with computerarithmetic circuits. Block adaptive filters are less susceptible to thisproblem since each data block represents only a limited amount of datato be processed at one time. However block adaptive filters have severaldrawbacks such as extra storage requirements for the blocks of data andthe need for smooth ‘joining’ of the resulting processed blocks of datatogether.

Constrained adaptive filters are known. They have a variety ofapplications such as minimum variance spectral analysis and forprocessing data from antenna arrays. Here an adaptive filter is requiredto minimise some signal subject to a constraint on the filter weightssuch as zero response at a particular frequency. Such adaptive filterscan be updated efficiently using the Kalman gain vector. The relevantparts of the invention described herein can thus be used in aconstrained adaptive filter.

In the example of the invention described with reference to FIG. 1 etsequi, the adaptive filter is shown as consisting of four blocks: RLSLattice Algorithm 12, Interpolation Residual Calculation 14,Normalisation 16, and Weight Update 18: highly detailed examples of theblocks 12 to 18 have been described in the embodiment of the inventionhereinbefore set out. However, with the benefit of the embodimentdescribed it will be clear to those of ordinary skill in the art offiltering that there are many different ways in which the blocks 12 to18 can be implemented.

The embodiment described herein derives the Kalman gain vector itself,ignoring arithmetic inaccuracies, i.e. rounding errors. It is known thatadaptive filters are realisable (possibly with reduced performance) ifthe Kalman gain vector is replaced by an approximation to it provided byone of a variety of alternative gain vectors: see e.g. the literature onFast Newton Algorithms referred to above. The advantage of thesealternative gain vectors is that they can be calculated with feweroperations than the Kalman gain vector.

In the example of the invention, the first step of generatinginterpolation residuals is to generate least squares (LS) predictionresiduals via a recursive LS Lattice (RLSL) algorithm. The RLSLalgorithm could be replaced by any other method of generating predictionresiduals. Other algorithms which could be used to generate LSprediction residuals are RLS algorithms such as the Conventional RLSalgorithm, the Fast Transversal Filter FTF) algorithm or the QR RLSalgorithm.

One could also use Non-RLS algorithms such as the Least Mean SquaresGEMS) algorithm, the Gradient Adaptive Lattice (GAL) algorithm, aClamped LSL algorithm or a Fast Newton Transversal Filter (FNTF)algorithm. In this case (at least some of) the residuals would beapproximate RLS prediction residuals. The use of a non-RLS algorithmwill lead to a gain vector that is not identical to the Kalman gainvector.

Prediction residuals need not be the only input to interpolationresidual calculation at 14 in FIG. 1. The basis of interpolationresidual calculation 14 is that iterations increase interpolationresiduals' order—their p and f subscripts. Prediction residualscorrespond to p=0 or f=0, and hence are possible starting variables.Clearly one can begin from any suitable set of variables. For example, aset of interpolation residuals with p=1, and a set of residuals withf=1, could be calculated (or approximated) by any means, and these couldthen be used to generate the desired interpolation residuals which makeup the un-normalised Kalman gain vector. One could use for example therecursive interpolation algorithm given by J T Yuan, “QR decompositionbased least squares lattice interpolators”, IEEE Trans SP January 2000vol 48(1) pages 70-79.

By suitable mathematical manipulation of Equations (24) and (25), it isalso possible to envisage the calculation of interpolation residuals byiterations that decrease the interpolation residuals' order—their p andf subscripts. Hence the residuals that constitute the un-normalisedKalman gain vector could be obtained from many other residuals asstarting variables. In particular it is possible to replace the divideand conquer structure in Stage 14 of FIG. 1 with a concatenation ofprocessing stages which “zig-zag” (see FIG. 5) from the first element tothe last element of the un-normalised Kalman gain vector. The set ofresiduals generated between these two extremes can include the residualscorresponding to all other elements of the un-normalised Kalman gainvector. This method requires fewer operations than the divide andconquer process used in the preferred embodiment but has not been usedtherein because it has inferior numerical properties when performedusing finite precision arithmetic.

There are several options for generating the un-normalised Kalman gainvector other than the divide and conquer approach described above. Forexample the same vector could be calculated from Equation (42) startingfrom backwards prediction residuals. This would require O(N²) operationsas opposed to the O(Nlog₂N) operations of the divide and conquer method.An interpolation residual may be calculated using basic iterations(whereby the order of the interpolation residuals (the p and fsubscripts) is increased by one) from an interpolation residual withlower indices via a variety of routes. If RLS iterations are used theresult is substantially the same regardless of the route taken.

There are therefore many potential sets of iterations which may beperformed to generate the un-normalised Kalman gain vector. Furthermore,there is a whole range of other vectors which could be generated insteadof the un-normalised Kalman gain vector. The Kalman gain vector is thevector used to update the filter weights in a RLS algorithm. There aremany other gain vectors which could be used to update the filter weightsother than the Kalman gain vector. An algorithm with a gain vector whichis close to the Kalman gain vector is expected to perform similarly to aRLS algorithm.

Approximations to the Kalman gain vector could be generated for exampleby replacing some (or all) of the iterations which increase theinterpolation order by computationally cheaper iterations. For example,the iterations could be replaced by their LMS equivalents, or any otherapproximations. In some cases the computation could reduce to merely atime delay or nothing at all (NB this is based on the same underlyingidea behind the ‘Fast Newton’ algorithm mentioned above). This isequivalent to making the approximation that a higher order interpolationresidual is equal to a lower order interpolation residual.

Referring to FIG. 5 once more, a second possibility is that, rather thantrying to calculate the interpolation residuals on the diagonal 56 a/56b, one could use instead the residuals that form a horizontal line 80starting from the bottom-left element of the Kalman gain vector. Thiscorresponds to a vector consisting of the elements ε_(N−1,ƒ)(t−ƒ) withƒ=0, . . . , N−1.

In the example the un-normalised Kalman Gain vector is normalised bydividing each element of the vector by its power; The power of eachresidual which is an element of the un-normalised Kalman Gain vector iscalculated via Equation (10). This quantity could be approximated forexample from just the a priori residuals or the a posteriori residualsor via many other means. Alternative normalisation factors could be usedto alter an algorithm's convergence characteristics: for examplechoosing a larger normalisation factor will slow down convergence, butit allows filter weights to become closer to their ideal values onceconverged.

The foregoing example is based on the Kalman gain vector defined byEquation (5). In the prior art RLS algorithms are known that use theso-called “dual Kalman-gain vector”. This is a scaled version of theKalman gain vector which is based on normalised a priori interpolationresiduals rather than a posteriori residuals. It requires a slightlymodified weight updating formula but it is also an RLS algorithm.Approximations (such as those mentioned above) to the dual Kalman-gainvector could also be used.

As described above in detail, it has been discovered in accordance withthe invention that a Kalman gain vector can be derived as a set ofnormalised interpolation residuals which can be used in an O(Nlog₂N)adaptive filter. However, the Kalman gain is an integral part of oneform of the ubiquitous recursive least squares (RIS) procedure. Apartfrom applications such as noise cancellation and communications channelequalisation which use adaptive filtering, the filtering procedure (andapproximate alternatives) can be used in adaptive beamforming. It canalso be used to construct multichannel adaptive filters which aresometimes used in sonar applications where they are known as broadbandbeamformers. The residuals that make up the Kalman gain vector areconventionally referred to in recent prior art as “interpolationresiduals” only if the data being processed is in the form of a timeseries. If data is taken from a set of spatial distributed sensors forexample, corresponding residuals do not have an accepted commondescriptive name even though the basic signal processing approach stillholds good For the purposes of the present invention, the expression“interpolation residual” shall be taken as referring to residualsobtained by interpolation from a sequence of data elements distributedover any dimension without limitation, i.e. time or space or any other.

Although it is as yet unverified for all practical situations, fromtheoretical considerations it is believed that—where filter parametersare updated using Equation (3)—calculating the Kalman gain vector viainterpolation residuals is numerically more robust than a conventionalapproach with similar computational requirements.

The foregoing example used real valued data. The invention can also beapplied to complex valued data such as in-phase and quadrature (I and Q)channels of data used in digital communications or radar systems.

Whereas it is preferred to obtain the Kalman gain vector with adivide-and-conquer approach of O(Nlog₂N) operations, it is quitepossible to use some other method to obtain interpolation residuals suchas one of the O(N²) methods described earlier if the resulting increasein number of operations is acceptable: The latter will still have anadvantage because as indicated above for good theoretical reasons it islikely to be more numerically robust than alternative approaches.

1. An adaptive filter implemented by means of filter weights, the filterincluding means for providing a sequence of signal samples forapplication to the filter and means for updating the filter weights bymeans of a gain vector derived from interpolation residuals of thesequence of signal samples applied to the filter.
 2. A method foradaptive filtering with filter weights comprising the steps of: a)deriving interpolation residuals from a sequence of signal samplesprovided as data for filtering, and b) updating the filter weights witha gain vector derived from the interpolation residuals.
 3. A methodaccording to claim 2 wherein the gain vector is a Kalman gain vector. 4.A method for adaptive filtering including: a) processing an inputsequence of signal samples to derive prediction residuals; b) convertingprediction residuals to interpolation residuals; c) deriving elements ofa gain vector from the interpolation residuals; and d) combining thegain vector with input and reference signals and update the filtercoefficients or weights as required to provide adaptive filtering.
 5. Amethod according 4 claim wherein the prediction residuals are leastsquares prediction residuals.
 6. A method according to claim 5 forobtaining prediction residuals by processing a sequence of signalsamples using a recursive least squares lattice (RLSL) algorithm.
 7. Amethod according to claim 4 including converting prediction residuals tointerpolation residuals corresponding to gain vector elements by aniterative approach in which each iteration changes an index of aresidual or of an intermediate quantity derived therefrom.
 8. A methodaccording to claim 7 wherein the iterative approach is a divide andconquer approach which treats prediction residuals as interpolationresiduals and changes the index appropriately to convert the predictionresidual to an interpolation residual providing an clement of the gainvector in un-normalised form.
 9. A method according to claim 8 whereinthe prediction residuals are least squares prediction residuals and theiterative approach treats the prediction residuals as interpolationresiduals with zero values for one of two indices corresponding toabsence of succeeding and preceding time series signal samplesrespectively and changes the zero index in each case sufficiently toconvert the forward or backward residual to an interpolation residualwhich is also an element of the gain vector in un-normalised form.
 10. Amethod according to claim 7 wherein the iterative approach also treatsas an intermediate result an iteration in a sequence thereof leading toan element of the gain vector and changes an index of the intermediateresult sufficiently to convert such result to an interpolation residualcorresponding to an element of the gain vector.
 11. A method accordingto claim 10 implementing a digital filter of order N where N is equal to2^(x) and x is an integer, including iteration in a sequence to generatean un-normalised element of the gain vector, the iteration beginningwith use of one of two residuals immediately adjacent and on oppositesides of a halfway point, the residual of use being that having arelatively higher value of an index not to be altered in the iterationsequence, and the relevant halfway point being halfway between twosequence, and the relevant halfway point being halfway between twoquantities: e) one of which is an interpolation residual and the other amember of the relevant time series for which the gain vector isgenerated, or f) one of which is an interpolation residual and the othera starting point for an earlier iteration, or g) which are respectivelystarting and end points for an earlier iteration.
 12. A method accordingto claim 10 implementing a digital filter of order N where N is equal toa sum of integers each of which is a power of two, wherein the iterativeapproach involves treating the filter as a combination of filters eachof order a respective power of two.
 13. A method according to claim 12including iteration in a sequence to generate an un-normalised elementof the gain vector, the iteration beginning with use of one of tworesiduals immediately adjacent and on opposite sides of a halfway pointthe residual of use being that having a relatively higher value of anindex not to be altered in the iteration sequence, and the relevanthalfway point being halfway between two sequence, and the relevanthalfway point being halfway between two quantities: h) one of which isan interpolation residual and the other a member of the relevant timeseries for which the gain vector is generated, or i) one of which is aninterpolation residual and the other a starting point for an earlieriteration, or j) which are respectively starting and end points for anearlier iteration.
 14. A method according to Claim 4 includingconverting prediction residuals to interpolation residuals correspondingto elements of a gain vector by a QR decomposition approach in whichequations of the form: k) ζ(i)=ψ(i)+k(i)ξ(i) are solved for ζ(i) andk(i) given ψ(i) and ξ(i) subject to a constraint that in so doing thereis minimisation of a sum of squares of ψ(j)+k(i)ξ(j) obtained fordifferent sample indexes j, where ζ(i) is an a posteriori interpolationresidual, ψ(i) and ξ(i) are a posteriori modified prediction andinterpolation residuals respectively, and k(i) is a coefficient to bedetermined; l) ψ(i)=ζ(i) −k(i)ξ(i) are solved for ψ(i) and ξ(i) k(i)given ζ(i) and ξ(i) subject to a constraint that in so doing the valueof k(i) that is obtained is substantially that which would be obtainedin solving ζ(i)=ψ(i)+k(i)ξ(i) for ζand k(i).
 15. A method according toclaim 14 wherein the QR decomposition employs square root freeequivalents of sine and cosine rotation parameters.
 16. A computersoftware product comprising a computer readable medium containingcomputer readable instructions for controlling operation of computerapparatus to implement an adaptive filter, wherein the computer readableinstructions provide a means for controlling the computer apparatus toinput a sequence of signal samples as data for filtering, and thecomputer readable instructions also provide a means for controlling thecomputer apparatus to generate updated filter weights by means of a gainvector derived from interpolation residuals of sequence of signalsamples.
 17. A computer software product according to claim 16 whereinthe gain vector is a Kalman gain vector.
 18. A computer software productcomprising readable medium containing computer readable instructions forcontrolling operation of computer apparatus to implement an adaptivefilter wherein the computer readable instructions provide a means forcontrolling the computer apparatus to: m) process an input sequence ofsignal samples to derive prediction residuals; n) convert predictionresiduals to interpolation residuals corresponding to elements of a gainvector; and o) derive elements of a gain vector from the interpolationresiduals; and p) combine the gain vector with input and referencesignals and update the filter coefficients or weights as required toprovide adaptive filtering.
 19. A computer software product according toclaim 18 wherein the prediction residuals are least squares predictionresiduals.
 20. A computer software product according to claim 18 whereinthe computer readable instructions provide a means for controlling thecomputer apparatus to obtain prediction residuals by processing asequence of signal samples using a recursive least squares lattice(RLSL) algorithm.
 21. A computer software product according to claim 18wherein the computer readable instructions provide a means forcontrolling the computer apparatus to convert prediction residuals tointerpolation residuals corresponding to gain vector elements by aniterative approach in which each iteration changes an index of aresidual or of an intermediate quantity derived therefrom.
 22. Acomputer software product according to claim 21 wherein the computerreadable instructions provide a means for controlling the computerapparatus to implement the iterative approach by a divide and conquerapproach which treats prediction residuals as interpolation residuals,and by proceeding with iteration until the index is changedappropriately to convert the prediction residual to an interpolationresidual providing an element of the gain vector in un-normalised form.23. A computer software product according to claim 22 wherein theprediction residuals are least squares prediction residuals and thecomputer readable instructions provide a means for controlling thecomputer apparatus to implement the iterative approach by treating theprediction residuals as interpolation residuals with zero values for oneof two indices corresponding to absence of succeeding and preceding timeseries signal samples respectively, and by proceeding with iterationuntil the zero index in each ease is changed sufficiently to convert theforward or backward residual to an interpolation residual which is alsoan element of the gain vector in un-normalised form.
 24. A computersoftware program according to claim 21 wherein the computer readableinstructions provide a means for controlling the computer apparatus toimplement the iterative approach by treating as an intermediate resultan iteration in a sequence thereof leading to an element of the gainvector, and by proceeding with iteration until an index of theintermediate result is changed sufficiently to convert such result to aninterpolation residual corresponding to an element of the gain vector.25. A computer software product according to claim 24 for use inimplementing a digital filter of order N where N is equal to 2^(x) and xis an integer, wherein the computer readable instruction provide a meansfor controlling the computer apparatus to implement iteration in asequence to generate an un-normalised element of the gain vectorbeginning with use of one of two residuals immediately adjacent and onopposite sides of a halfway point, the residual of use being that havinga relatively higher value of an index not to be altered in the iterationsequence, and the relevant halfway point being halfway between twoquantities: q) one of which is an interpolation residual and the other amember of the relevant time series for which the gain vector isgenerated, or r) one of which is an interpolation residual and the othera starting point for an earlier iteration, or s) which are respectivelystarting and end points for an earlier iteration.
 26. A computersoftware product according to claim 24 for use in implementing a digitalfilter of order N where N is equal to a sum of integers each of which isa power of two, wherein the computer readable instructions provide ameans for controlling the computer apparatus to implement the iterativeapproach by treating the filter as a combination of filters each oforder a respective power of two.
 27. A computer software productaccording to claim 26 wherein the computer readable instructions providea means for controlling the computer apparatus to implement iteration ina sequence to generate an un-normalised element of the gain vectorbeginning with use of one of two residuals immediately adjacent and onopposite sides of a halfway point, the residual of use being that havinga relatively higher value of an index not to be altered in the iterationsequence, and the relevant halfway point being halfway between twoquantities: t) one of which is an interpolation residual and the other amember of the relevant time series for which the gain vector isgenerated, or u) one of which is an interpolation residual and the othera starting point for an earlier iteration, or v) which are respectivelystaring and end points for an earlier iteration.
 28. A computer softwareproduct according to claim 18 wherein the computer readable instructionsprovide a means for controlling the computer apparatus to convertprediction residuals to interpolation residuals corresponding toelements of a gain vector by a QR decomposition approach in whichequations of the form: w) ζ(i)=ψ(i)+k(i)ξ(i) are solved for ζ(i) andk(i) given ψ(i) and ξ(i) subject to a constraint that in so doing thereis minimization of a sum of squares of ψ(j)+k(i)ξ(j) obtained fordifferent sample indexes j, where ζ(i) is an a posteriori interpolationresidual, ψ(i) and ξ(i) are a posteriori modified prediction andinterpolation residuals respectively, and k(i) is a coefficient to bedetermined; and x) ψ(i)=ζ(i) −k(i)ξ(i) are solved for ψ(i) and ξ(i) k(i)given ζ(i) and ξ(i) subject to a constraint that in so doing the valueof k(i) that is obtained is substantially that which would be obtainedin solving ζ(i)=ψ(i)+k(i)ξ(i) for ζand k(i).
 29. A computer softwareproduct according to claim 28 wherein the QR decomposition employssquare root free equivalents of sine and cosine rotation parameters.